MATHEMATICS IN THE 
BEHAVIORAL AND SOCIAL SCIENCES 

JOHN W. BISHIR/DONALD W. DREWES 

North Carolina State University 



HARCOURT, BRACE & WORLD, INC.
New York / Chicago / San Francisco / Atlanta




To Mary and Betty 


© 1970 by Harcourt, Brace & World, Inc.

All rights reserved. No part of this publication may be reproduced or
transmitted in any form or by any means, electronic or mechanical, including
photocopy, recording, or any information storage and retrieval system,
without permission in writing from the publisher.

ISBN 0-15-555251-1

Library of Congress Catalog Card Number: 70-113704
Printed in the United States of America


PART I: Paul Klee, Disturbed Equilibrium, 1922. Paul Klee Foundation,
Museum of Fine Arts, Berne (photo by Museum of Modern Art, New York).

PART II: Paul Klee, Centrifugal Forces, 1929. Paul Klee Foundation,
Museum of Fine Arts, Berne (photo by Museum of Modern Art, New York).

PART III: Paul Klee, Inscription, 1918. Felix Klee.

PART IV: Paul Klee, Monument in the Orchard, 1929. Paul Klee Foundation,
Museum of Fine Arts, Berne.

For all the above, permission S.P.A.D.E.M. 1970 by French Reproduction
Rights, Inc.



PREFACE 


The utility of scientific knowledge is judged according to the degree to which
it permits understanding of past events and prediction and control of future
events. Description of events, being a historical record of past experience,
cannot be used directly for future prediction and control. Rather,
expectations of future events are generated within a logical system consisting
of two or more abstract variables and the rules for specifying their
interdependence. This logical system is called a model.

A logical system in which the variables are mathematical symbols and the
structural relations between variables consist of a set of equations is called a
mathematical model. Formulation of models in mathematical rather than
literary or verbal form is often advantageous in that (1) mathematical
language is often more concise and analytic than verbal language, (2) the
relations between variables must be explicitly stated, thereby facilitating
public scrutiny and evaluation of explanations, (3) the model builder has
access to supportive mathematical theories, and (4) an explanatory structure
can be readily generalized to the n-variable case.

This book introduces the mathematical methods most frequently used in
the behavioral and social sciences. The mathematical content follows closely
the recommendations made by the Committee on the Undergraduate
Program in Mathematics of the Mathematical Association of America for
students in the behavioral, management, and social sciences. Whenever
possible, usage of mathematical content is illustrated by models drawn from
the scientific literature. In other cases, hypothetical models are offered
which attempt to capture the essence, if not the actuality, of the use of
mathematical models.

The book is written for both those reasonably competent in mathematics
who are seeking to relate mathematics to the social and behavioral sciences
and those who have yet to develop their mathematical skills. Admittedly the
task is easier for the former. To make it more feasible for the latter, the
material is presented so as to facilitate an intuitive understanding of the
mathematical concepts that govern useful techniques in the analysis of
social and behavioral problems.





For those interested in pursuing applications in specific content areas,
literature references are footnoted within the textual material. Mathematical
readings are listed at the end of each chapter.

Topics in the book are arranged according to a meaningful order of
mathematical material rather than a hierarchy of models. Part I considers
the finite mathematics relevant to the concept of system as a group of related
entities. Matrices and linear algebra are discussed in Part II and applied to
the analysis of systems containing n variables. Part III introduces differential
and integral calculus and includes a parallel discussion of difference equations
and their use in studying the dynamics of system behavior. Part IV deals
with probability theory and its application to the development of random
models in the social and behavioral sciences.

For each mathematical topic, we have included examples and problems
from each discipline in proportion to usage. For instance, there are relatively
more examples and problems drawn from classical economics in Part III
(Calculus), whereas Part IV (Probability) contains proportionately more
material drawn from psychology and management science.

The book contains more than 1500 problems, some within the main
textual discussion to allow immediate verification of understanding. In
addition, most sections contain problems which call for “proofs.” Although
these problems do not contain results required for an understanding of later
material, it is our feeling that such problems facilitate the development of
mathematical skills.

The following diagram shows how the book may be used in one-, two-, or
three-semester courses. A solid line indicates dependence on the preceding
chapter; a dashed line indicates use of only a portion of the preceding
material. For example, a one-semester course emphasizing matrix algebra and
its applications would survey the necessary parts of Chapters 1 through 6
before concentrating on Chapters 8 through 10. A one-semester course
emphasizing basic discrete probability can be taught using parts of Chapters
1 through 7, together with Chapters 18 and 19. (In each of these cases, some
selection is necessary from Chapters 1 through 7. In our experience, a
typical class will be able to cover 200 text pages when meeting three times
per week for one semester.)

A two-semester course emphasizing the general ideas of “finite mathematics,”
including matrix algebra and Markov chains, would use Chapters 1
through 9 and 18 through 20. (The material need not be covered in order.)
A two-semester course emphasizing techniques of calculus consists of
Chapters 1 through 6, and 11 through 17, with options of including none,
some, or all of Chapters 13, 15, 16, and 17. Finally, the entire book may be
covered in three semesters. In this case, the large number of problems and
the presence of Chapters 10, 17, and 21 allow considerable latitude for the
individual preferences of the instructor.

It is a pleasure to thank the many people who have contributed so much
to the completion of this book. Special thanks go to Mrs. Carol Little, Mrs.
Debra Currin, and Mrs. Sherry Ford for their many hours of careful typing;
to Miss Jane Woodbridge of Harcourt, Brace and World for a superb job
of editing; to Dr. Nicholas Rose, Chairman of the Department of
Mathematics at North Carolina State University, for his assistance and
encouragement; to Professor David Rosen of Swarthmore College for a careful
review which led to substantial improvement of Part III on calculus; and
especially to our wives for their encouragement and understanding through
the seemingly endless job of writing.


John W. Bishir
Donald W. Drewes


CONTENTS 


PART I

Finite Mathematics

Chapter 1
Set Theory
1.1 Set Membership
1.2 Subsets and Set Equality
1.3 Complement, Union, and Intersection
1.4 Algebra of Sets: Combining Union, Intersection, and Complement
Supplementary Reading

Chapter 2
Symbolic Logic
2.1 Statements and Their Truth Sets
2.2 Equivalence and Implication
2.3 Valid Arguments
Supplementary Reading

Chapter 3
Relations
3.1 Binary Relations
3.2 Equivalence Relations
3.3 Weak Ordering Relations and Preference Relations
3.4 Partial and Simple Orderings


3.5 Graphic Representation of Order Relations
Supplementary Reading

Chapter 4
Functions and Real Numbers
4.1 Functions
4.2 Absolute Value Function
4.3 Bounded Sets of Real Numbers
Supplementary Reading

Chapter 5
Sequences
5.1 Basic Concepts
5.2 Sequences of Statements: Mathematical Induction
5.3 Convergent Sequences of Real Numbers
5.4 Convergence of Monotone Sequences
Supplementary Reading

Chapter 6
Infinite Series
6.1 Summation Notation
6.2 Convergence of Infinite Series
6.3 Bases for the Real Number System
6.4 Techniques for Summing Infinite Series
6.5 Double Sums and Double Series
Supplementary Reading

Chapter 7
Combinatorial Analysis
7.1 Two Basic Principles of Counting
7.2 Counting Numbers of Samples
7.3 Binomial Coefficients
Supplementary Reading


PART II

Matrices and Linear Algebra

Chapter 8
Addition and Multiplication of Matrices
8.1 Addition of Matrices
8.2 Multiplication of Matrices
8.3 Basic Properties of Multiplication, Block Multiplication
8.4 Applications: Matrices and Directed Graphs and Input-Output Analysis
Supplementary Reading

Chapter 9
Linear Equations and Linear Programming
9.1 Linear Equations
9.2 Elementary Operations on Matrices
9.3 The Inverse of a Matrix
9.4 Linear Inequalities and Convex Sets
9.5 Linear Programming
Supplementary Reading

Chapter 10
Characteristic Equations: Quadratic Forms
10.1 Determinants
10.2 Characteristic Roots and Characteristic Vectors
10.3 Symmetric Matrices and Quadratic Forms
Supplementary Reading

PART III

Calculus

Chapter 11
Functions, Limits, and Continuity
11.1 Real-Valued Functions
11.2 Limits
11.3 More About Limits
11.4 Continuous Functions
Supplementary Reading

Chapter 12
Differential Calculus
12.1 The Derivative
12.2 Derivatives of Sums, Products, and Quotients
12.3 Mean Value Theorem
12.4 Composite Functions and the Chain Rule
12.5 Derivatives of Trigonometric Functions
12.6 Inverse Functions: The Inverse Trigonometric Functions
12.7 Logarithmic and Exponential Functions
12.8 Applications of Differentiation: Optimization and Stability
12.9 Higher-Order Derivatives: Taylor's Theorem
Supplementary Reading


Chapter 13
Difference Equations
13.1 Concept of a Difference
13.2 Basic Properties of Differences
13.3 Applications
13.4 The Simplest Difference Equations
13.5 General Linear Difference Equations with Right-Hand Side Zero
13.6 Method of Complementary and Particular Solutions
Supplementary Reading

Chapter 14
Integral Calculus
14.1 Area
14.2 The Integral
14.3 Antiderivatives
14.4 Fundamental Theorem of Calculus or Integration Made Easy
14.5 Basic Properties of Integrals
14.6 Applications
14.7 Integration by Parts
14.8 Integration by Substitution
14.9 Rational Functions and Partial Fraction Methods
14.10 Improper Integrals
Supplementary Reading

Chapter 15
Differential Equations
15.1 Separation of Variables
15.2 Linear Differential Equations
Supplementary Reading

Chapter 16
Power Series
16.1 Taylor Series
16.2 Differentiation and Integration of Power Series
Supplementary Reading

Chapter 17
Functions of Several Variables
17.1 Partial Derivatives
17.2 Chain Rule for Functions of Several Variables
17.3 Taylor's Formula: Optimization
… Restraints: Lagrange Multipliers
Supplementary Reading


PART IV

Probability

Chapter 18
Basic Concepts of Probability
18.1 Sample Spaces
18.2 Events
18.3 The Axioms of Probability
18.4 Basic Theorems
18.5 Conditional Probability
18.6 Applications of Conditional Probability
18.7 Independent Events and Independent Trials
Supplementary Reading

Chapter 19
Discrete Random Variables
19.1 Random Variables and Probability Functions
19.2 Random Variables Associated with Bernoulli Trials
19.3 Expected Value of a Random Variable
19.4 Jointly Distributed Random Variables
19.5 Sums of Random Variables
19.6 Generating Functions
Supplementary Reading

Chapter 20
Markov Chains
20.1 Basic Concepts
20.2 Classification of the States
20.3 Regular Chains
20.4 Chains with Open Classes
20.5 Applications in Sociology
20.6 Applications in Psychology
Supplementary Reading

Chapter 21
Continuous Time Processes and Continuous Random Variables
21.1 The Poisson Process
21.2 Continuous Time Random Processes
21.3 Continuous Random Variables
21.4 The Normal Distribution
Supplementary Reading

Answers to Selected Exercises

Index






PART I


Finite Mathematics



Chapter 1

SET THEORY


1.1 SET MEMBERSHIP

The language and techniques of set theory provide the basic tools for many
branches of mathematics of significance to the behavioral and social
scientist. Sets also appear in mathematical models developed for the behavioral
sciences. Thus we begin our mathematical discussion with a look at sets.

A set is any collection of objects. We may speak, for example, of the set
of closing quotations on the New York Stock Exchange for September 23,
or the set of possible dominance relations among a group of ten people, or
the set of all solutions of the equation $x^2 - 1 = 0$, or the set of all purple
cows. Other terms, such as class or collection, are sometimes used as
synonyms for set.

The objects in a set are called elements, or members, of the set. We say that
elements belong to the set. The notation $x \in A$ is used to indicate that the
object x is a member of the set A. If, for example, P is the set of all
psychotics, we might indicate that Mr. X is psychotic by writing $X \in P$.

One way of describing a set is to enclose, in braces, letters or numbers
separated by commas to represent the members of the set. Thus, the set S
of all solutions of the equation $x^2 - 1 = 0$ could be written as
$S = \{1, -1\}$. If R and L denote, respectively, “the rat turns right” and
“the rat turns left,” then the set of possible choices made in a single run
through a T-maze is $C = \{R, L\}$. Similarly, {Adams, Jefferson, Monroe}
denotes the set of U.S. Presidents who died on July 4.
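
The brace notation and the membership relation can be tried out directly on a computer. The following is only an illustrative sketch in Python (a language chosen for this illustration, not one used in the text); the variable names are ours, and the name "Lincoln" is simply a test value.

```python
# Sets written by listing their members inside braces.
S = {1, -1}                               # solutions of x^2 - 1 = 0
C = {"R", "L"}                            # choices in a single T-maze run
july4 = {"Adams", "Jefferson", "Monroe"}  # the set of Presidents given in the text

# "x in A" plays the role of the membership statement x ∈ A.
print(1 in S)                 # True
print("Lincoln" in july4)     # False

# Every listed member of S really does satisfy the equation.
print(all(x**2 - 1 == 0 for x in S))
```

Note that the order in which the members are listed does not matter: `{1, -1}` and `{-1, 1}` describe the same set.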

PROBLEMS

1. Write the following sets.

(a) The set of countries lying in both the Eastern and Western hemispheres.

(b) The set of living former Presidents of the United States.

(c) The set of states having only one representative in the House of
Representatives.

(d) The set of possible combinations of coins which amount to …

A set may also be described by stating a criterion which members of the
set must satisfy. For instance, the set $S = \{1, -1\}$ may be denoted in the
alternative form $S = \{x : x^2 - 1 = 0\}$, read “S is the set of all objects x
having the property that $x^2 - 1 = 0$.” In general, the notation
$A = \{x : p(x)\}$ means that A is the set of all objects x about which the
proposition p(x) is true. It is common to use this method of representing a set
when the members are not known exactly or are too numerous to list. Thus we
represent the set of all millionaires by M = {x : x is a millionaire} and the set
of all real numbers larger than 4 by G = {y : y is a real number and y > 4}.
We have, for example, Onassis $\in M$ and $9 \in G$.
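
Describing a set by a criterion has a close analogue in programming. The sketch below, in Python, is offered only as an illustration: the finite range of candidates and the function name `in_G` are assumptions made here, since a program cannot list all real numbers.

```python
# A finite pool of candidates (an assumption: we only search the integers -5..5).
candidates = range(-5, 6)

# S = {x : x^2 - 1 = 0}, built by keeping the candidates for which p(x) is true.
S = {x for x in candidates if x**2 - 1 == 0}
print(S == {1, -1})     # True

# G = {y : y is a real number and y > 4} is too large to list,
# so we represent it by its defining proposition instead.
def in_G(y):
    """Return True when y satisfies the criterion defining G."""
    return isinstance(y, (int, float)) and y > 4

print(in_G(9))          # True, matching the statement that 9 belongs to G
print(in_G(4))          # False: 4 is not larger than 4
```

The second half shows the design choice forced by sets with too many members to list: the proposition p(x) itself stands in for the set.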


PROBLEMS 

2. Denote the following sets.

(a) The set of citizens of Canada.

(b) The set of stocks listed on the American Stock Exchange.

(c) The set of positive even integers.

Some sets, such as the set of all English words that begin with the letters
qa, contain no elements. Such sets are called empty sets, or are referred to as
the null set, and are denoted by the symbol $\emptyset$. Other empty sets are the
set of U.S. Presidents who have lived to be 100 years old and the set of all
audible tones greater than 20,000 cps.


PROBLEMS 

3. Which of the following sets are empty?

(a) $\{0\}$

(b) The set of integer solutions of the equation $x^2 - 3 = 0$

(c) $\{\emptyset\}$


The members of a set may themselves be sets. A social club represents a
certain set of people, and it may be of interest to speak of the set of social
clubs in a given town. However, we never allow a set to be a member of
itself. The following example, due to Bertrand Russell, shows why.

Example 1 (Russell's Paradox) Suppose we wish to divide the collection of
all sets into two smaller collections, one of which, M, is to contain all those
sets which are members of themselves while the other, N, is to contain those
sets which are not members of themselves. Obviously, every set is assigned
either to M or to N but no set belongs to both M and N. Question: Where
shall we place N itself? If N is placed in M, then N is a member of itself
and hence must be a member of N. On the other hand, if N is placed in N,
then N is a member of itself and hence must be a member of M. Either
assumption is self-contradictory.

We may resolve this paradox by restricting our theory to those sets which
are not members of themselves. Unfortunately, this solution of our troubles
only leads to further complications. For N has now become the collection of
all sets. But since N is itself a set (and thus $N \in N$, which is now illegal),
we must also agree to consider the concept “the set of all sets” as meaningless.

Example 1 has dealt briefly with … To be properly specified, a set must be
well defined. That is, for any object we must be able to decide whether it
does or does not belong to the set. … other books, such as … and Luchins,
listed in the Supplementary Reading at the end of this chapter.

PROBLEMS

4. Denote each of the following sets in …

(a) The set of living Presidents of the United States.

(b) The set of stocks listed on the New York Stock Exchange.

(c) The set of all living personal acquaintances of George Washington.

(d) The set of possible times required by a rat to run down a straight …

(e) The set of 26-year-old male citizens of the United States who have
legally voted in three presidential elections.

(f) The set of real numbers satisfying the equation $x^2 + 3x = 0$.

5. Let (x, y) denote a point in the plane. Identify the following sets … results.

(a) $\{(x, y) : 3x - y = \ldots\}$

(b) $\{(x, y) : x + y = 5\}$

(c) $\{(x, y) : x + y = 5 \text{ and } 3x - y = \ldots\}$

(d) $\{(x, y) : x + y = 5 \text{ and } 2x + 2y = \ldots\}$

(e) $\{(x, y) : x + y = 5 \text{ and } 2x + 2y = 10\}$

6. In an experiment the same problem is presented to each of four people A,
B, C, and D. The outcome of the experiment is defined as the set of people
who solve the problem.

(a) How many possible outcomes are there?

(b) Write the set of possible outcomes.

7. Four paintings, one each by Renoir, Picasso, Klee, and Braque, are
presented to a student who is asked to match paintings and names. List the
set of possible outcomes. (Hint: The set contains 24 elements.)


Problems 8 and 9 relate to Russell's Paradox.

8. The village barber shaves all those males (and only those) who do not
shave themselves. Who shaves the barber? (Designate by S the set of
males who shave themselves and by B the set of those who do not shave
themselves. Which set contains the village barber?)


9. Upon being captured by cannibals a missionary is told that he may
make one statement. If the statement is true, he will be boiled, and if the
statement is false, he will be roasted. What should the cannibals do if the
missionary says “I will be roasted”?


10. In an experiment on concept formation in small children, an experimenter
wishes to vary the dimensions of color, size, and number of petals in a
flower sketch. If there are three colors, R, G, and Y, three sizes, S, M, and
L, and three petal configurations, 4, 6, and 8, list the set of possible
stimulus configurations available. (There are 27.)


11. To illustrate the logical inadequacy of displaying a few elements of a set
and indicating by three dots that the pattern is to be continued, consider
the set A of all numbers of the form

$$n^2 + (n - 1)(n - 2)(n - 3) \tag{1}$$

where n is a positive integer. Show that the first three elements, obtained
when n = 1, 2, and 3, are 1, 4, and 9 so that one is tempted to write

$$A = \{1, 4, 9, \ldots\}$$

If A were written in this way on an intelligence test, we would not
hesitate to write the next element as 16. Show, however, that the next
element, obtained by putting n = 4 in (1), is 22 rather than 16.

12. Write a defining relation for elements of a set so that its fourth element
is, say, 73, while its first three elements are 1, 4, and 9.
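
The arithmetic behind Problem 11 is easy to check by machine. The short Python sketch below is an illustration only; the function name `element` is ours. It evaluates formula (1), $n^2 + (n - 1)(n - 2)(n - 3)$, for n = 1, 2, 3, 4 and compares the result with the squares that the displayed pattern seems to suggest.

```python
def element(n):
    """Value of formula (1): n^2 + (n - 1)(n - 2)(n - 3)."""
    return n**2 + (n - 1) * (n - 2) * (n - 3)

first_four = [element(n) for n in range(1, 5)]
squares = [n**2 for n in range(1, 5)]

print(first_four)   # [1, 4, 9, 22]
print(squares)      # [1, 4, 9, 16]
```

The two lists agree in their first three entries because the product $(n - 1)(n - 2)(n - 3)$ vanishes for n = 1, 2, and 3; it first contributes at n = 4, where it equals 6.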


1.2 SUBSETS AND SET EQUALITY

If every member of set A is also a member of set B, then A is called a
subset of B. If A is a subset of B, we write $A \subseteq B$. For example, let B be
the set of states which lie east of the Mississippi River. Then the set
A = {Illinois, Indiana, Kentucky, North Carolina, Virginia, West Virginia},
representing the collection of those states whose state …

Our definition of subset may be rephrased by saying that A is a subset of
B if there are no elements of A which are not contained in B. Of course, if
A is an empty set $\emptyset$ this condition is automatically satisfied and it follows
that $\emptyset$, having no members, must be a subset of every set.

In principle, a subset A of a set B is completely determined when we have
decided for each element of B whether or not it is to be included in A. When
B has only a few elements, the number of decisions required is small and we
may easily list all subsets of B. Thus the subsets of the two-element set
$B = \{x, y\}$ are B itself, obtained by including both elements, the sets $\{x\}$
and $\{y\}$ obtained by including one element and excluding the other, and
$\emptyset$, obtained by excluding both elements. Similarly, the subsets of
$B = \{x, y, z\}$ are $\emptyset$, containing no elements, $\{x\}$, $\{y\}$, and $\{z\}$,
each containing one element, the two-element sets $\{x, y\}$, $\{x, z\}$, and
$\{y, z\}$, and B itself.


PROBLEMS

1. (a) List the subsets of $B = \{x, y, z, w\}$. How many are there?
(b) If B contains n elements, how many subsets are there?


As the number n of elements in B increases, it soon becomes impractical to
list all subsets. For each element of B, there are two possible choices (include
or exclude) and thus there are $2 \times 2 \times \cdots \times 2 = 2^n$ subsets which may
be formed using elements of B. For n = 2, 3, or 4 it is easy to list all the
subsets. But when n = 10, there are $2^{10} = 1024$ subsets,
while if n = 20, the number of subsets exceeds one million.
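
The include-or-exclude argument can be carried out mechanically. The Python sketch below is an illustration only; the function name `subsets` is ours, and it relies on the standard library routine `itertools.combinations` to pick out the subsets of each size.

```python
from itertools import combinations

def subsets(B):
    """Return a list of all subsets of the finite set B."""
    items = sorted(B)
    return [set(c) for r in range(len(items) + 1)
                   for c in combinations(items, r)]

for s in subsets({"x", "y", "z"}):
    print(s)                       # eight sets, from set() up to {'x', 'y', 'z'}

# The count agrees with 2^n for small n, and 2^n grows very quickly.
for n in (2, 3, 4):
    print(n, len(subsets(set(range(n)))), 2**n)
print(2**10, 2**20)                # 1024 1048576
```

Listing is practical only for small n; for n = 20 the program would have to produce more than a million sets, which is the point made in the text.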


PROBLEMS

2. If A contains 40 elements, how many subsets does A have?

Two sets are said to be equal if each is a subset of the other. That is,
two sets are equal if they contain the same elements, or if neither contains an
element not also contained in the other.

For simplicity we shall adopt the convention that any listing of elements of
a set contains no duplications. Stated another way, we agree that listing
the same element twice does not change the set. Thus the sets
$A = \{1, 2, 1, 3\}$ and $B = \{1, 2, 3\}$ are considered equal. (Any element
contained in one of the sets is also contained in the other.)

As a final point concerning set equality we note that a set having a single
member is not to be considered as being identical to that member. If, for
example, L is the set of 95 people belonging to the local Lions club, then L
and {L} are two different sets. For {L} is the set of local Lions clubs
and in our example has but one element, while L is the set of people belonging
to the local Lions club and contains 95 elements. The two sets must differ
since they do not contain the same elements.
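
Both points about equality can be seen in a short Python sketch. It is an illustration only: the member names `member_0`, `member_1`, and so on are hypothetical stand-ins for the 95 club members, and `frozenset` is used because Python allows only unchangeable sets to be members of other sets.

```python
# Listing an element twice does not change the set.
A = {1, 2, 1, 3}
B = {1, 2, 3}
print(A == B)                             # True
print(A.issubset(B) and B.issubset(A))    # True: each is a subset of the other

# L is a set of 95 people; {L} is a set whose only member is the club itself.
L = frozenset(f"member_{i}" for i in range(95))   # hypothetical member names
club_set = {L}

print(len(L), len(club_set))              # 95 1
print(L == club_set)                      # False: different elements
```

The final comparison is false for exactly the reason given in the text: the elements of L are people, whereas the single element of {L} is a club.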

PROBLEMS

3. Determine whether $A = B$ or $A \neq B$.

(a) $A = \{2, 4, 6\}$ and $B = \{4, 6, 2\}$

(b) $A = \{1, 2, 3\}$, B = {Mercury, Venus, Earth}

(c) A = {T : T is a plane equilateral triangle},

B = {U : U is a plane equiangular triangle}

4. Which of the following are true?

(00-0 (d)OG0 

5. Let $A = \{1, 2, 3\}$. Identify the sets B such that $\{1\} \subseteq B$,
$B \subseteq A$, and $B \neq A$.

6. Which of the following are correct and why?

W Wi e {U)'i (b) M e 

(0 (xS e (x, (x5! (<*) W £ l*. W> 

(e) a - {«} (0 w,b) " <*.«} 

7. Seven objects are to be presented in groups of three in such a manner
that each object appears with each other object once and only once
during the sequence of presentations. How many three-element subsets
must be presented in order to meet this restriction?

8. Ivan, Sean, Juan, Maria, and Betty line up for a picture. Naturally, boys
and girls alternate. List the set of all possibilities. Then list the following
subsets.

(a) The set in which Ivan is in the middle.

(b) The set in which a boy is at each end.

(c) The set in which Ivan is between Maria and Betty.

(d) The set in which Maria is at the left end of the line.

9. The Tredmore Shoe Company has a top management committee
consisting of four vice-presidents and the president. The committee reaches
its decision by simple majority vote.

(a) List the set of all possible outcomes of the vote on a given motion,
assuming that no member abstains from voting.

(b) List the subset of winning outcomes, that is, those outcomes which
result in the motion being passed.

List the subset of winning outcomes if

(c) the president has veto power.

(d) the president has two votes.

(e) the president votes only in the case of a tied vote.

10. (a) Prove that every set is a subset of itself. Contrast this with the
comments preceding Russell's Paradox (Example 1 of Section 1.1).
(b) Prove that $\emptyset$ has only one subset. What is it?

11. The set A is called a proper subset of B if $A \subseteq B$ and $A \neq B$.

(a) List the proper subsets of $B = \{L, R\}$.

(b) Show that $\emptyset$ has no proper subsets.

(c) Show that A = B if and only if A is a subset of B but not a proper
subset.


12. (a) If $A \subseteq B$, is it necessarily true that $B \subseteq A$?

(b) If $A \in B$, does it follow that $B \in A$?

13. (a) If $A \subseteq B$ and $B \subseteq C$, does it follow that $A \subseteq C$?
(b) Same question with $\subseteq$ replaced by $\in$.

14. (a) If $Z \in A$ and $A \subseteq B$, does it necessarily follow that $Z \in B$?

(b) If $Z \in B$ and $A \subseteq B$, does it necessarily follow that $Z \in A$?

15. Four people A, B, C, and D are waiting for an interview. The interviewer
may elect to see them singly or in groups of … List the
collection of possible sets of interviewees the interviewer may see
in the first interview session. Compare with Problem 1(a).

16. In a committee of four individuals A, B, C, and D, each having one vote,
a majority is a set of three or more committeemen. Thus the set M of
possible majorities is a set having sets as elements.

(a) List the five elements of M.

(b) Argue that the majority {A, B, C}, although a set, is not a subset,
but rather an element, of M.

(c) Is {{A, B, C}} an element of M? a subset of M?


1.3 COMPLEMENT, UNION, AND INTERSECTION

In building workable models to be used …, we are generally not interested
in all possible … but rather only in those which are contained in some …
In economics, for instance, one often limits the discussion to some set of
commodities with the understanding that unless an explicit statement to the
contrary is made, all statements refer to this set.

The set of those objects which are of interest in a particular discussion is
called the domain of discourse or the universal set for that discussion. The
domain of ordinary discourse is often vague and the … of applicability of a
statement are more often assumed than precisely specified. Thus, when it is
remarked that “Everyone knows that the Beatles are a … singing group,”
it is to be understood that “everyone” refers to …, principally of Western
cultures, who know something of pop music, … infants, Bantu
tribesmen, etc. The scientist, however, is allowed no such latitude. His task
is to express information in such a way as to minimize the possibility of
misunderstanding. In order to achieve this goal he finds it necessary to limit
his choice of language and symbols. In particular, he must begin with a domain
of discourse definite enough to allow an objective evaluation as to what does …

Consider the problem of “solving” the equation $2x^2 - x - 3 = 0$. Different
results are obtained depending on whether we mean integer solutions
$\{-1\}$, real solutions $\{-1, \tfrac{3}{2}\}$, or positive solutions $\{\tfrac{3}{2}\}$.
We must specify beforehand those objects to which the term solution is to be
applied. That is to say, we must specify the universal set of objects which may
be substituted for x. Once a universal set U has been specified the term set
will always refer to a subset of U.
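
The dependence of "the solution set" on the universal set can be made concrete. The Python sketch below is an illustration only: the names are ours, and exact fractions from the standard `fractions` module are used so that the root 3/2 is checked without rounding error. The two roots come from the factorization $2x^2 - x - 3 = (2x - 3)(x + 1)$.

```python
from fractions import Fraction

def solves(x):
    """True when x satisfies 2x^2 - x - 3 = 0."""
    return 2 * x**2 - x - 3 == 0

# The real solutions, taken from the factorization (2x - 3)(x + 1) = 0.
real_solutions = {Fraction(-1), Fraction(3, 2)}
print(all(solves(x) for x in real_solutions))     # True

# Restricting the universal set changes the answer.
integer_solutions = {x for x in real_solutions if x.denominator == 1}
positive_solutions = {x for x in real_solutions if x > 0}

print(integer_solutions)     # {Fraction(-1, 1)}
print(positive_solutions)    # {Fraction(3, 2)}
```

The equation is the same in all three cases; only the set of objects admitted as candidates for x has changed.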

We now turn to a consideration of how sets may be manipulated to produce
other sets. Throughout we shall assume that a universal set U has been
specified and that all sets mentioned are subsets of U.

If A is any subset of U, the collection of those elements of U which are not
contained in A is called the complement of A (sometimes, the term “the
complement of A relative to U” is used) and is denoted by $A'$. Thus, if
$U = \{a, b, c, d, e\}$ and $A = \{a, c\}$, then $A' = \{b, d, e\}$. Similarly, the
complement of $B = \{a, b, c, e\}$ is $B' = \{d\}$. If U is the set of all integers
and E is the set of even integers, then $E'$ is the set of all odd integers. The
complement of the set of negative integers is the set of integers which are
positive or zero.


PROBLEMS

1. The universal set is U = {Washington, Adams, Jefferson, Monroe}.
What is the complement of the set containing

(a) The names of those Presidents who were Federalists?

(b) The names which begin with A?

(c) The names of those Presidents who served after 1834?

(d) Those names containing nine letters?

Since in our discussion all elements will be contained in U, it follows that
the complement of U is the empty set $\emptyset$. In turn, the complement of
$\emptyset$ is U. It is apparent from the definition that complements always come
in pairs: if $A'$ is the complement of A, then A is the complement of $A'$.
Symbolically,

$$(A')' = A$$
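
The complement is simply a set difference taken against the universal set. The Python sketch below is an illustration only, using the five-element universal set and the sets A and B from the discussion of complements; the function name `complement` is ours.

```python
U = {"a", "b", "c", "d", "e"}        # the universal set
A = {"a", "c"}
B = {"a", "b", "c", "e"}

def complement(S, universe=U):
    """Elements of the universe which are not contained in S."""
    return universe - S

print(complement(A) == {"b", "d", "e"})   # True
print(complement(B) == {"d"})             # True

# Complements come in pairs, and U and the empty set are complements of each other.
print(complement(complement(A)) == A)     # True
print(complement(U) == set())             # True
print(complement(set()) == U)             # True
```

Passing the universe as an argument is a deliberate choice: the complement of a set has no meaning until the universal set has been specified.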


A convenient way of indicating complements is shown in Figure 1, called
a Venn diagram. The rectangle represents the universal set U, the elements of
U being contained within its boundaries. Elements of A are contained within
the circle. Elements of $A'$ are in the shaded region, in U but outside A.

If A and B are any two sets, the collection of elements common to both A
and B is called the intersection of A and B and is denoted by $A \cap B$. The
collection of those elements which belong either to A or to B or to both is
called the union of A and B and is denoted by $A \cup B$.

If $U = \{a, b, c, d, e\}$, $A = \{a, c\}$, and $B = \{a, b, d, e\}$, then
$A \cap B = \{a\}$ and $A \cup B = \{a, b, c, d, e\} = U$. If $C = \{b, d\}$, then
$A \cap C = \emptyset$. In the Venn diagram of Figure 2, the intersection of A and
B is represented by the heavily shaded region common to both sets while the
union consists of the entire shaded area.
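
Intersection and union correspond to built-in set operations in many programming languages. The Python sketch below is an illustration only. The sets U, A = {a, c}, and C = {b, d} follow the example above; the set B = {a, b, d, e} is an assumption made here, chosen so that A and B have exactly one common element and together exhaust U.

```python
U = {"a", "b", "c", "d", "e"}
A = {"a", "c"}
B = {"a", "b", "d", "e"}     # assumed for illustration
C = {"b", "d"}

print(A & B)                 # intersection: {'a'}
print((A | B) == U)          # union: True, every element of U is in A or B
print(A & C == set())        # True: A and C have no common element

# An element belongs to the union when it belongs to A or to B or to both.
print(all((x in A) or (x in B) for x in A | B))   # True
```

The last line restates the definition of union as a test on individual elements; note that the word "or" is used in the inclusive sense.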


FIGURE 1 


FIGURE 2 



2. If $U = \{1, 2, 3, 4, 5, 6, 7\}$, $A = \{1, 4\}$, $B = \{2, 4, 7\}$, and
$C = \{2, 6\}$, what is $A \cap B$? $A \cup B$? $A \cap C$?


Example 1 Luce and Rogow* use the ^ ns “ a ^ t f f S ^gre°^men who voted 

or the congressional power structure IfLi ,s the set of minority 

in favor of a bill sponsored by the majori y p> y, denote 

congressmen and V is the set of all congressmen, how might w 

(a) the set of defectors from the minority party'' 

(b) the set of loyal minority party members'' 



In Figure 3 are labeled four regions. Any element x (or congressman) in U
lies in exactly one of them.

Region R1 = F ∩ M. Here x ∈ F and x ∈ M.

Region R2 = F ∩ M'. Here x ∈ F and x ∉ M.

Region R3 = F' ∩ M. Here x ∉ F and x ∈ M.

Region R4 = F' ∩ M'. Here x ∉ F and x ∉ M.

Since the defectors are those minority congressmen who voted with the
majority party, the set of defecting minority congressmen is F ∩ M, represented
by region R1. The loyal minority members are not in set F but rather in the
complement of F and, specifically, in that part of F' which is contained in M.
Thus M ∩ F', represented by region R3, is the set of minority members who
did not vote with the majority party.

*Luce, R. D., and Rogow, A. A., "A Game Theoretic Analysis of Congressional
Power Distributions for a Stable Two-Party System," Behavioral Science 1,
83-95 (1956).
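The four regions can be computed mechanically once F and M are known. The sketch below uses an invented roster of six congressmen (all names and votes are hypothetical) and checks that the regions are nonoverlapping and together exhaust U.

```python
# Hypothetical roster: names, votes, and party membership are invented.
U = {"Adams", "Baker", "Clark", "Davis", "Evans", "Frank"}
F = {"Adams", "Baker", "Clark"}   # voted in favor of the majority bill
M = {"Clark", "Davis"}            # members of the minority party

F_c = U - F   # complement of F
M_c = U - M   # complement of M

regions = {
    "F and M": F & M,            # defectors from the minority party
    "F and not M": F & M_c,      # majority members voting for the bill
    "not F and M": F_c & M,      # loyal minority members
    "not F and not M": F_c & M_c,
}

# Every congressman falls in exactly one region.
covered = set().union(*regions.values())
counted = sum(len(r) for r in regions.values())
print(covered == U, counted == len(U))
print(sorted(regions["F and M"]), sorted(regions["not F and M"]))
```

With two sets there are always 2 × 2 = 4 such regions, one for each combination of "in or out of F" and "in or out of M".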


The operations of union and intersection may be extended to any number
of sets. We define the intersection of any collection of sets as the set
containing those elements which belong to all the sets. The union of a collection
of sets is the set of those elements which belong to at least one of the sets in
the collection.


Example 2 Stoyva and Kamiya* have conducted studies in which they
attempt to relate dreaming to the subject's verbal reports of dreams and to
the rapid eye movement of the subject while sleeping. Let U denote the set of
subjects participating in the experiment, D the set of subjects in which dream
activity occurs, V the set of subjects who, upon awakening, report having
dreamed, and E the set of subjects on whom rapid eye movement is measured.
The nonoverlapping regions in Figure 4(a) represent the eight logically
possible categories into which a subject may be placed relative to the sets D, V,
and E. Region R1, the intersection D ∩ V ∩ E of the three sets, represents
the set of subjects in whom dreaming occurs, rapid eye movement is
measured, and who, upon awakening, report dream activity. Similarly, region
R4 = D ∩ V' ∩ E' represents the set of subjects in which dreaming occurs
which is not subsequently reported, and in which no rapid eye movement is
detected. The set of those who make verbal reports is the union V = R1 ∪
R2 ∪ R5 ∪ R6, while those who dream and in whom rapid eye movement
is measured is the set D ∩ E = R1 ∪ R3.

PROBLEMS 

3. Proceeding as in Example 2, express each of the regions R2, R3, R5, R6,
R7, and R8 in terms of the sets D, V, and E. In each case give a verbal
description of the resulting combination.


Example 3 Suppose that in Example 2 the following statistics are reported
on a group of 100 persons tested:

55 dreamed (D)
50 reported dreams (V)
40 exhibited rapid eye movement (E)
23 did both D and V
15 did both D and E
19 did both V and E
7 did all three, D, V, and E

How many persons had dream activity which was neither reported nor
indicated by eye movement? That is, how many persons are in region R4?

To answer this sort of question we use the data to find the number of
people in each of the eight regions of Figure 4(a). The results are shown in
Figure 4(b). The trick is to work backward through the list. Thus the last item
indicates that there are seven persons in R1 = D ∩ V ∩ E. Next, the 19
persons in V ∩ E go either in R1 or in R5. Since seven are already placed in
R1, this leaves 12 for R5. Continuing in this way, we complete the
enumeration one category at a time. In particular, we find 24 persons in
region R4.
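The backward enumeration can be written out as a short computation. The sketch below uses only the counts reported in Example 3; the variable names are our own, and the regions are labeled by description rather than by their numbers in Figure 4.

```python
# Counts reported in Example 3 (100 persons tested).
total = 100
n_D, n_V, n_E = 55, 50, 40
n_DV, n_DE, n_VE = 23, 15, 19
n_DVE = 7

# Work backward: start with the triple overlap, then the pairwise overlaps.
all_three = n_DVE
dv_only = n_DV - all_three          # in D and V but not E
de_only = n_DE - all_three          # in D and E but not V
ve_only = n_VE - all_three          # in V and E but not D

# Whatever remains of each single set lies in that set alone.
d_only = n_D - dv_only - de_only - all_three
v_only = n_V - dv_only - ve_only - all_three
e_only = n_E - de_only - ve_only - all_three

in_some_set = (all_three + dv_only + de_only + ve_only
               + d_only + v_only + e_only)
in_no_set = total - in_some_set

print(ve_only)     # 12 persons in V and E but not D
print(d_only)      # 24 persons dreamed with no report and no eye movement
print(in_no_set)   # 5 persons in none of D, V, E
```

The eight counts add up to the 100 persons tested, which is a useful check on the arithmetic.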




PROBLEMS

4. What universal set is understood in the statement "The only states
extending farther south than [illegible] are [illegible] and Hawaii"?

5. In the Venn diagram of Figure 4(a), what region or combination of regions
does each of the following sets represent?

(a) [illegible]                 (b) E ∪ V

(c) (E ∩ D)'                    (d) [illegible]

(e) [illegible]                 (f) (E ∩ D) ∪ V

6. The sets A, B, and C are subsets of a universal set U. Arrange the
following sets in sequential order so that each set in the sequence is a subset
of the next set: A ∪ B, U, A ∩ B, [remaining sets illegible]

7. Sets A and B are called disjoint if A ∩ B = ∅.

(a) Prove that if A and B are disjoint, [remainder illegible]

(b) [illegible]

8. Restle* defines a set Y to be between sets X and Z if and only if
X ∩ Y' ∩ Z = ∅ and X' ∩ Y ∩ Z' = ∅.

(a) Give a verbal description of betweenness.

(b) Show that Y is between X and Z if and only if X ∩ Z ⊆ Y ⊆ X ∪ Z.

(c) Prove that if X ⊆ Y ⊆ Z, then Y is between X and Z.

(d) Prove that if X is between W and Z, and Y is between X and Z, then
Y is between W and Z.


9. Let U denote the set of possible outcomes of an experiment in which
three subjects A, B, and C each answer a simple yes-no question. Let T
denote the subset of U containing those outcomes in which subject A
responds yes, V the set of outcomes in which all three subjects respond
yes, and W the set of outcomes in which the number of no's exceeds the
number of yes's.

(a) List the elements of U. (There are eight.)

(b) Write the following subsets of U: T ∩ V, T ∪ V, [illegible],
(T ∩ V)', V ∪ W, and (T ∩ W') ∪ V.


10. Which of the following are correct and which incorrect?

(a) (A ∪ B) ∩ C = A ∪ (B ∩ C')

(b) A ∩ B ∩ C = A ∩ B ∩ (C ∪ B)

(c) A ∪ B ∪ C = A ∪ [remainder illegible]

(d) A ∪ B = A ∪ [remainder illegible]

(e) (A ∩ B ∩ C) ⊆ (A ∩ B) ∪ (B ∩ C) ∪ (C ∩ A)

(f) (A ∪ B) ∩ A' = B

(g) (A ∩ B ∩ C) ⊆ A ∪ B

(h) (A ∪ B ∪ C)' = A' ∩ B' ∩ C'

(i) (A ∪ B)' ∩ C = A' ∩ B' ∩ C

(j) (A ∪ B)' ∩ C = (A' ∩ C) ∪ (B' ∩ C)

11. Let U be the set of all employees of Behavioral Research, Inc., and
let A1, A2, and A3 denote the sets of newly hired employees who score,
respectively, below 50, between 50 and 150, and above 150 on the initial
placement test. Further, let J1, J2, and J3 denote the sets of employees in
respective job classifications I, II, and III. Translate each of the
following statements into set notation.

(a) All newly hired employees who score below 50 are placed in job
classification I.

(b) No newly hired employee who scores above 150 is placed in either
job classification I or II.

(c) All newly hired employees who score between 50 and 150 are placed
in either job classification II or III.

(d) Some newly hired employees who score over 150 are placed in job
classification III.

(e) No employee of the company is assigned multiple job classifications,
there being only three classifications.

*Restle, F., Psychology of Judgment and Choice: A Theoretical Essay (John Wiley
& Sons, Inc., New York, 1961), p. 46.

12. For each of the following draw a Venn diagram to show that the pairs of
sets are represented by the same region.

(a) [illegible]                 (b) A ∪ A' and U

(c) A ∪ (B ∩ C) and (A ∪ B) ∩ (A ∪ C)     (d) [illegible]

13. A convention of insurance agents was attended by 240 who sell
life insurance, 230 who sell fire and casualty insurance, and [illegible] who
sell auto insurance. Of these agents, [illegible] sell both auto insurance and
fire and casualty insurance, 120 sell both life insurance and fire and casualty
insurance, 110 sell both life insurance and auto insurance, and 60 sell all
three. All agents in attendance sell at least one of the aforementioned
types of insurance.

(a) How many agents attended the convention?

(b) How many agents sell auto insurance [illegible] insurance?

(c) How many agents sell only life insurance?

(d) How many agents specialize in only one type of insurance?

14. According to National Safety Council* statistics, there were [illegible]
motor vehicle fatalities in 1965. Of these, [illegible] occurred in places
classified as urban, 23,200 occurred during daylight hours, and 17,500
occurred at night in places classified as rural. Further classification
revealed that 3100 pedestrian fatalities occurred in rural places, 15,900
nonpedestrian fatalities occurred in rural places at night, 2550 pedestrian
fatalities occurred in urban places during daylight hours, and 1600
pedestrian fatalities occurred at night in rural places.

(a) What proportion of the victims of daylight accidents in rural places
were pedestrians?

(b) What proportion of the victims of daylight accidents in urban places
were nonpedestrian?

(c) What proportion of the total fatalities in rural areas occurred during
the daylight hours?

(d) What proportion of the total pedestrian fatalities in rural areas
occurred at night?

15. In a survey of the voting habits of [illegible], it was found
that 605 voted in the 1956 presidential election, 595 voted in the 1960
presidential election, and 675 voted in the 1964 presidential election. Of
these, 415 voted in both the 1964 and 1960 elections, 385 voted in both
the 1956 and 1960 elections, 395 voted in both the 1956 and 1964
elections, and 65 did not vote in any of the three elections.

(a) How many voted in all three elections?

(b) How many voted in exactly one election?

(c) How many voted in two or more elections?

*Accident Facts (National Safety Council, Chicago, Ill.), 1966.



16. [Opening of problem illegible] sets that correspond to the following
statements.

(a) The model outcomes duplicate exactly those in the real world.

(b) The experimental conditions perfectly duplicate the environment
assumed in the model.

(c) The experimental outcomes are common to both the model and the
real world outcomes.

(d) The outcomes predicted by the model are found in the real world but
do not duplicate the experimental outcomes.

(e) There are model outcomes that are not found in the experimental
situation or the real world.


17. Show that

(a) U' = ∅

(b) if A ⊆ B, then B' ⊆ A'

18. The collection of elements which belong to A or to B, but not to both, is
called the symmetric difference† of the sets A and B and is denoted by
A Δ B. Show that A Δ B = (A' ∩ B) ∪ (B' ∩ A).

19. The set of elements contained in A but not in B is sometimes called the
set difference of A and B and denoted by A − B.

(a) Show that A − B = A ∩ B'.

(b) Show that the complement of A may be written as U − A.

(c) Show that A Δ B = (A ∪ B) − (A ∩ B).

(d) Show that A Δ B = (A − B) ∪ (B − A). This explains why A Δ B
is called the symmetric difference.

(e) What sets in Example 1 may be written in the form A − B?
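Before attempting the proofs in Problems 18 and 19, it can help to see the identities hold on one concrete case. The sketch below is only a spot check on sets chosen for illustration (an assumption), not a proof. It uses Python's built-in operators - for set difference and ^ for symmetric difference.

```python
# Illustrative sets; any other choice would serve equally well.
U = set(range(1, 8))
A = {1, 2, 3, 4}
B = {3, 4, 5}

A_c = U - A   # complement of A relative to U
B_c = U - B   # complement of B relative to U

sym = A ^ B   # symmetric difference: in A or in B, but not in both

print(sorted(sym))
print(sym == (A_c & B) | (B_c & A))   # form given in Problem 18
print(A - B == A & B_c)               # Problem 19(a)
print(sym == (A | B) - (A & B))       # Problem 19(c)
print(sym == (A - B) | (B - A))       # Problem 19(d)
```

A single example can reveal a false identity but can never establish a true one; the proofs still require the element-by-element arguments used elsewhere in the chapter.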


1.4 ALGEBRA OF SETS: COMBINING UNION,
INTERSECTION, AND COMPLEMENT


In this section we consider various relations which exist among the set
operations union, intersection, and complement. In so doing, we will discover that
there are close analogies between the operations of union and intersection
for sets and the operations of addition and multiplication for numbers. The
basic results are summarized in the following theorem.


*[Authors illegible], "Nature, the Experimental Laboratory, and the
Credibility of Hypotheses," Behavioral Science 9, 239-246 (1964).

†For an example of its use see Galanter, E., "An Axiomatic and Experimental Study of
Sensory Order and Measure," Psychological Review 63, 218-227 (1956).



Theorem 1 Let A, B, and C be subsets of a universal set U. Then the
following relationships hold.


Identity laws

A ∪ ∅ = A

A ∪ U = U

A ∩ ∅ = ∅

A ∩ U = A

Complement laws

A ∪ A' = U

(A')' = A

A ∩ A' = ∅

Idempotent laws

A ∪ A = A

A ∩ A = A

Commutative laws

A ∪ B = B ∪ A

A ∩ B = B ∩ A

Associative laws

(A ∪ B) ∪ C = A ∪ (B ∪ C)

(A ∩ B) ∩ C = A ∩ (B ∩ C)

Distributive laws

A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

De Morgan's laws

(A ∪ B)' = A' ∩ B'

(A ∩ B)' = A' ∪ B' ►
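The laws of Theorem 1 can be checked by brute force on a small universal set. The sketch below assumes a three-element universe (our choice) and tests every triple of subsets; the helper names subsets, comp, and theorem1_holds are hypothetical. Such a check illustrates the theorem but does not replace a proof.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})   # a small universal set chosen for illustration
EMPTY = frozenset()


def subsets(universe):
    """Return every subset of the universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comp(S):
    """Complement of S relative to U."""
    return U - S


def theorem1_holds(A, B, C):
    """Check each law of Theorem 1 for one choice of A, B, and C."""
    return all([
        A | EMPTY == A, A | U == U, A & EMPTY == EMPTY, A & U == A,
        A | comp(A) == U, comp(comp(A)) == A, A & comp(A) == EMPTY,
        A | A == A, A & A == A,
        A | B == B | A, A & B == B & A,
        (A | B) | C == A | (B | C), (A & B) & C == A & (B & C),
        A | (B & C) == (A | B) & (A | C),
        A & (B | C) == (A & B) | (A & C),
        comp(A | B) == comp(A) & comp(B),
        comp(A & B) == comp(A) | comp(B),
    ])


all_ok = all(theorem1_holds(A, B, C)
             for A, B, C in product(subsets(U), repeat=3))
print(len(subsets(U)), all_ok)
```

With three elements there are 8 subsets and therefore 512 triples (A, B, C), so the exhaustive check is immediate.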


Before proving this theorem, let us note that most of these properties of set
operations are familiar from our knowledge of the algebra of numbers
studied in high school. Since adding zero to any number yields that number
as a sum, zero is an identity number for addition. Similarly, unity is an
identity number for multiplication, since multiplying any number by unity
yields that number as a product. Inspection of the identity laws for sets shows
that the empty set is an identity with respect to union and U is an identity
for intersection. Because of these analogies, A ∪ B is sometimes called the
sum of A and B and A ∩ B the product.

The analogy may be carried further with the commutative and associative
laws. Thus, addition and multiplication of numbers are commutative, that is,

a + b = b + a and a · b = b · a

for any numbers a and b. Moreover, the order in which numbers are added
or multiplied is irrelevant. That is, the associative laws

a + (b + c) = (a + b) + c and a · (b · c) = (a · b) · c

hold for any three numbers a, b, and c. (Parentheses indicate those operations
which are to be performed first.)

A distinction between operations on sets and operations on numbers
arises with the distributive laws. There are two such laws for sets, one of
which distributes union over intersection while the other distributes
intersection over union. For numbers, there is only one distributive law, which
distributes multiplication over addition. That is,

a · (b + c) = a · b + a · c

for any three numbers a, b, and c. It is not true that addition distributes over
multiplication since, in general, we find

a + (b · c) ≠ (a + b) · (a + c)

For example, 3 + (4 · 2) = 11, while (3 + 4) · (3 + 2) = 35.

A further distinction between numbers and sets arises from the fact that
numbers do not satisfy idempotent laws. For instance, if A is a number, then
A + A = 2A, while if A is a set, A ∪ A = A.
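The contrast between numbers and sets can be seen side by side in a few lines. The numbers 3, 4, and 2 are those of the example in the text; the three sets are illustrative assumptions.

```python
# Numbers: multiplication distributes over addition, but not conversely.
a, b, c = 3, 4, 2
print(a * (b + c) == a * b + a * c)       # True
print(a + (b * c), (a + b) * (a + c))     # 11 35

# Sets (illustrative choices): each operation distributes over the other.
A, B, C = {1, 2}, {2, 3}, {3, 4}
print(A | (B & C) == (A | B) & (A | C))   # True
print(A & (B | C) == (A & B) | (A & C))   # True

# Idempotent laws hold for sets but not for numbers.
print(a + a == a, A | A == A)             # False True
```

Here | plays the part of the "sum" and & the part of the "product" of two sets.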

PROBLEMS 

1. Let U = {a, b, c, d, e, f, g}, A = {a, [illegible]}, B = {b, c, f, g}, and
C = {a, b, e, g}. Verify the statements of Theorem 1.

PROOF OF THEOREM 1 We shall prove only the first distributive law,
leaving the remainder of Theorem 1 for the exercises. Let us first consider
the procedure we must follow. By definition two sets L and R are equal if
and only if each is a subset of the other. Our procedure for proving equality
is thus clearly indicated. We first choose an arbitrary element x ∈ L, and
show that x ∈ R. Since the choice of x was arbitrary we conclude that every
member of L is also a member of R and hence that L ⊆ R. We then reverse
the argument by choosing an arbitrary element x ∈ R, show that x ∈ L,
and deduce that R ⊆ L. From these two results we conclude L = R.

In order to prove the first distributive law, then, we must show both

A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C)     (2)

and

(A ∪ B) ∩ (A ∪ C) ⊆ A ∪ (B ∩ C)     (3)

To prove (2), we choose an arbitrary element x ∈ A ∪ (B ∩ C), the
shaded area shown in Figure 5(a). There are two cases to consider.

Case (i) x ∈ A. Then certainly x ∈ A ∪ B and x ∈ A ∪ C and it follows
that x ∈ (A ∪ B) ∩ (A ∪ C).

Case (ii) x ∈ B ∩ C. Then x ∈ B and x ∈ C. Hence x ∈ A ∪ B and
x ∈ A ∪ C, so again x ∈ (A ∪ B) ∩ (A ∪ C).

Having considered all possible cases, we have established that
A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C).

To complete the proof, we choose an arbitrary element x in
(A ∪ B) ∩ (A ∪ C), the cross-hatched area shown in Figure 5(b). Then
x ∈ A ∪ B and x ∈ A ∪ C. Again there are two cases.

Case (i) x ∈ A. If so, then x must also be an element of A ∪ (B ∩ C).

Case (ii) x ∉ A. Then in order to have x ∈ A ∪ B and x ∈ A ∪ C, we
must have x ∈ B and x ∈ C. Hence x ∈ B ∩ C and again it follows
that x ∈ A ∪ (B ∩ C).

The proof of (3), and of the first distributive law, is now complete. ►
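The case analysis in the proof depends only on whether x belongs to each of A, B, and C. There are just eight such combinations, so both inclusions can be checked by a membership table. The following sketch (function names lhs and rhs are our own) carries this out; it is offered as a companion to the proof, not as the authors' method.

```python
from itertools import product


def lhs(in_A, in_B, in_C):
    """Is x in A ∪ (B ∩ C), given its membership in A, B, and C?"""
    return in_A or (in_B and in_C)


def rhs(in_A, in_B, in_C):
    """Is x in (A ∪ B) ∩ (A ∪ C), given its membership in A, B, and C?"""
    return (in_A or in_B) and (in_A or in_C)


# All eight ways an element can lie inside or outside A, B, and C.
rows = list(product([False, True], repeat=3))

# Inclusion (2): every element of the left side is in the right side.
forward = all(rhs(*row) for row in rows if lhs(*row))
# Inclusion (3): every element of the right side is in the left side.
backward = all(lhs(*row) for row in rows if rhs(*row))

print(len(rows), forward, backward)
```

Because an arbitrary element of an arbitrary universe must fall into one of the eight rows, the table covers every case the proof has to consider.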

PROBLEMS 

2. Draw Venn diagrams illustrating the other statements of Theorem 1.


In the proof of Theorem 1 we repeatedly used the following result, the
proof of which is left as an exercise (see Figure 6).


FIGURE 6


Theorem 2 For any two sets A and B,

A ∩ B ⊆ A ⊆ A ∪ B

and

A ∩ B ⊆ B ⊆ A ∪ B





Example 1 Orcutt et al.* have developed a simulation model of the labor
force in which each person is classified according to each of the following
categories:

Place of Residence              Age
R1  Rural, farm                 A1  25-29
R2  Rural, nonfarm              A2  30-34
R3  City 2500-25,000            A3  35-44
R4  City 25,000-100,000         A4  45-64

Education
E1  5-6 years elementary school
E2  7-8 years elementary school
E3  1-3 years high school
E4  4 years high school
E5  1-3 years college
E6  4 years college


Let us take the universal set U to be the set of all persons in the labor force,
and let us denote by L the set of all employed persons and by R1, R2, ...,
E6 the sets of persons in the categories listed above.

Assuming that no person with 4 or more years of college is unemployed,
then the set of unemployed persons other than those who have neither 4 or
more years of college nor only 5 or 6 years of grade school is identical with
the set of unemployed persons having only 5 or 6 years of grade school.
Translated into set notation we have

L' ∩ (E6' ∩ E1')' = L' ∩ [(E6')' ∪ (E1')']      [De Morgan's Laws]
                  = L' ∩ (E6 ∪ E1)              [Complement Laws]
                  = (L' ∩ E6) ∪ (L' ∩ E1)       [Distributive Law]
                  = ∅ ∪ (L' ∩ E1)               [Assumption]
                  = L' ∩ E1                     [Identity Laws]

►


An argument similar to that given in the proof of Theorem 1 serves to
establish a more general form of the distributive laws for sets. Recall that
by the intersection of a collection C of sets A_i we mean the set of those
elements which belong to all the sets A_i. We denote this set by ⋂_C A_i. The
union of the collection is the set of elements belonging to at least one of the
sets A_i, and is denoted by ⋃_C A_i.


Theorem 3 The Generalized Distributive Laws Let C be an arbitrary
collection of sets A_i, and E be any set. Then

E ∪ (⋂_C A_i) = ⋂_C (E ∪ A_i)     (4)

and

E ∩ (⋃_C A_i) = ⋃_C (E ∩ A_i)     (5)

*Orcutt, G. H., Greenberger, M., Korbel, J., and Rivlin, A. M., Microanalysis of
Socioeconomic Systems: A Simulation Study (Harper and Row, Publishers, Inc.,
New York, 1961).

PROOF The proof of (4) closely parallels those of (2) and (3) above. If
x ∈ E ∪ (⋂_C A_i), then x ∈ E or x is a member of every A_i. If x ∈ E, then by
Theorem 2, x ∈ E ∪ A_i for every A_i, and hence x ∈ ⋂_C (E ∪ A_i). If x is a
member of every A_i, then x is a member of every set E ∪ A_i and again
x ∈ ⋂_C (E ∪ A_i). We conclude that E ∪ (⋂_C A_i) is a subset of ⋂_C (E ∪ A_i).

Conversely, if x ∈ ⋂_C (E ∪ A_i), then either x ∈ E or x ∉ E. If x ∈ E,
Theorem 2 guarantees that x ∈ E ∪ (⋂_C A_i). If x ∉ E, then x can be a
member of ⋂_C (E ∪ A_i) only by being a member of every A_i. Hence
x ∈ ⋂_C A_i and x ∈ E ∪ (⋂_C A_i). We conclude that ⋂_C (E ∪ A_i) is a subset
of E ∪ (⋂_C A_i), completing the proof of (4). The proof of (5) is left as an
exercise (Problem 14). ►
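Formulas (4) and (5) can be tried on a concrete collection. In the sketch below the set E, the collection of three sets, and the helper names big_union and big_intersection are all illustrative assumptions; the check is a spot test on one example, not a proof.

```python
from functools import reduce


def big_union(collection):
    """Union of a collection of sets (empty set for an empty collection)."""
    return reduce(set.union, collection, set())


def big_intersection(collection):
    """Intersection of a nonempty collection of sets."""
    return reduce(set.intersection, collection)


# Illustrative data (assumptions): one set E and a collection of three sets.
E = {1, 2}
collection = [{2, 3, 4}, {3, 4, 5}, {4, 6}]

# Formula (4): E ∪ (⋂ A_i) = ⋂ (E ∪ A_i)
left4 = E | big_intersection(collection)
right4 = big_intersection([E | A for A in collection])

# Formula (5): E ∩ (⋃ A_i) = ⋃ (E ∩ A_i)
left5 = E & big_union(collection)
right5 = big_union([E & A for A in collection])

print(sorted(left4), left4 == right4)
print(sorted(left5), left5 == right5)
```

The intersection helper is restricted to nonempty collections because the intersection of no sets at all would have to be the whole universal set, which the function is not given.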

Example 2 In the notation of Example 1, the set of persons between ages
25 and 44 is A1 ∪ A2 ∪ A3. Hence the set of these persons who are residents
of rural nonfarm areas is R2 ∩ (A1 ∪ A2 ∪ A3). Using the generalized
distributive law (5) this may be rewritten

(R2 ∩ A1) ∪ (R2 ∩ A2) ∪ (R2 ∩ A3) ►

De Morgan's laws may be extended to any collection of sets in a
manner similar to the distributive law.

Theorem 4 Generalized De Morgan's Laws Let C be an arbitrary
collection of sets A_i. Then

(⋃_C A_i)' = ⋂_C A_i'     (6)

and

(⋂_C A_i)' = ⋃_C A_i'     (7)

In words, this says that the complement of a "union" is the "intersection"
of the complements, while the complement of an "intersection" is the
"union" of the complements.

PROOF We shall prove (6), leaving (7) as an exercise (Problem 14). Choose
an arbitrary element x in (⋃_C A_i)'. Then x ∉ ⋃_C A_i, which means that x is not
a member of any A_i. That is, x lies outside every A_i, and thus x ∈ ⋂_C A_i'
(see Figure 7).


FIGURE 7 Venn diagram to illustrate Formula (6)


Conversely, if x ∈ ⋂_C A_i', then x lies outside every A_i, from which it
follows that x is not a member of any A_i, that is, x ∉ ⋃_C A_i, or that
x ∈ (⋃_C A_i)'. This completes the proof.


Example 3 In Example 1, any person having less than 5 years of schooling
is by definition not included in any of the sets E1, ..., E6. That is, any
such person is a member of

(E1 ∪ E2 ∪ ... ∪ E6)' = E1' ∩ E2' ∩ ... ∩ E6' ►
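Formulas (6) and (7) can likewise be checked on a small example. The universal set, the collection of sets, and the helper name comp below are illustrative assumptions; the sketch is a spot check, not a proof.

```python
# Illustrative data (assumptions): a ten-element universe and three subsets.
U = set(range(1, 11))
collection = [{1, 2}, {2, 3, 4}, {6}]


def comp(S):
    """Complement of S relative to U."""
    return U - S


union_all = set().union(*collection)
intersection_all = set.intersection(*collection)
complements = [comp(A) for A in collection]

# Formula (6): the complement of the union is the intersection of complements.
left6 = comp(union_all)
right6 = set.intersection(*complements)

# Formula (7): the complement of the intersection is the union of complements.
left7 = comp(intersection_all)
right7 = set().union(*complements)

print(sorted(left6), left6 == right6)
print(sorted(left7), left7 == right7)
```

In this example the three sets have no element in common, so the right side of (7) must be all of U, as the output confirms.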


PROBLEMS 

3. A sociologist is studying the membership structure of cliques in high
school. Let U be the set of all students and let the following be subsets
of U.

M = the set of male students
F = the set of female students
D = the set of students who date
C = the set of students who plan on going to college
O = the set of students who are members of the "out group" as defined
in this study
I = the set of students who are members of the "in group" as defined
in this study

Translate each of the following empirical observations into either an
equation or an inequality using set notation and the symbols =, ≠, ∅,
∩, ∪, ⊆.

(a) Some male students do not plan on attending college.

(b) Not all the female students date.

(c) Those girls who date or who plan on attending college are members
of the "in group."

(d) No student is a member of both the "in group" and the "out group."

(e) Neither the male students who do not plan on attending college nor
the girls who do not date are members of the "in group."

(f) Some boys who do not date but who plan on attending college are
members of the "in group."

(g) All students are members of either the "in group" or the "out group."

(h) Given this information, what conclusions can we draw concerning
the membership of the "out group"? Express these conclusions in
set notation.


4. Prove that the following laws hold for any subsets A, B, and C of a
universal set U. In each case draw an appropriate Venn diagram.

(a) A ∪ ∅ = A                          (b) A ∩ ∅ = ∅

(c) (A ∩ B) ∩ C = A ∩ (B ∩ C)          (d) (A' ∩ B')' = A ∪ B

(e) (A ∩ B) ∩ (A ∩ B') = ∅

(f) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

(g) A ∩ B ⊆ A ⊆ A ∪ B

5. Simplify each of the following by drawing an appropriate Venn diagram.

(a) A ∩ (A ∩ B)                 (b) A ∩ [(A ∩ B') ∪ (A' ∩ B)]

(c) (A ∪ B) ∩ (A ∪ B')          (d) [(A ∩ B') ∪ (B ∩ C')] ∩ (A ∩ C')


6. Corcoran* presented cards containing symbols which differed in shape,
color, and size to a group of subjects in order to assess the effects of
task complexity and practice on performance after sleep loss. The
symbols were either squares or circles, black or white, and large or small.
Let S, C, B, W, L, and N denote the respective sets of cards containing
square, circular, black, white, large, and small symbols. Describe the
attributes of the following sets of cards.

(a) S ∩ L                       (b) W ∩ N'

(c) (C ∩ W) ∪ (C ∩ B)           (d) (B ∩ W) ∪ S

(e) S − (L ∩ B)                 (f) (S ∩ B) ∪ (S ∩ L)

(g) (B ∩ W)' ∩ (C ∪ L)          (h) C ∩ (L − W)


7. Simplify each of the following.

(a) A Δ U                       (b) A Δ ∅

(c) A Δ A'                      (d) A Δ A

(The symbol Δ is defined in Problem 18, Section 1.3.)


8. Construct an example to show that, in general,

A Δ (B ∩ C) ≠ (A Δ B) ∩ (A Δ C)

That is, there are sets A, B, and C for which the relation does not hold.

9. Prove that [A ∩ (B ∪ C)] ∪ [B ∩ C] = [A ∩ B] ∪ [(A ∪ B) ∩ C].

[Hint: The analogous statement for numbers is a(b + c) + bc =
ab + (a + b)c.]

10. Prove that (A ∪ B) ∩ [remainder illegible]

11. It is true for numbers that ab + (a + b)c = (a + c)b + ac. State and
prove an analogous result for sets.

*Corcoran, D. W. J., "The Influence of Task Complexity and Practice on Performance
After Loss of Sleep," Journal of Applied Psychology 48, 339-343 (1964).

12. Prove the remaining statements in Theorem 1.

13. Prove Theorem 2.

14. Prove Equations (5) and (7).

15. Show by a suitable example that the following statements are true.

(a) Subtraction is not associative.

(b) Subtraction is not commutative.

(c) Division is not associative.

(d) Division is not commutative.

16. Show by choosing suitable examples that

(a) the operation of taking differences of sets is not associative

(b) set differencing is not commutative

(c) the operation of symmetric difference is not distributive over union

(d) symmetric difference is not distributive over intersection

(The difference of two sets is defined in Problem 19, Section 1.3.)

17. As an example of noncommutative mathematical operations, let M(x)
indicate that the number x is to be multiplied by 3 and A(x) that 2 is to
be added to x. Show that A[M(x)] is not the same as M[A(x)].

18. Let a ∘ b be defined as the average of two numbers, that is, (a + b)/2.
Show that the operation ∘ is commutative but not associative.

19. Suppose that an applicant is evaluated by two separate interviewers. If
the applicant fails either or both of the evaluations, he fails to qualify
for employment. The selection operation * defined on the set {0, 1},
where 0 denotes "failure" and 1 denotes "success," may be described by
the following table.

                    evaluation B
            *       0       1
evaluation A
            0       0       0
            1       0       1

Thus 0 * 0 = 0, 0 * 1 = 0, 1 * 0 = 0, and 1 * 1 = 1. Determine whether
the properties of commutativity and associativity hold for the operation *.

20. The operation Δ defined on the set {0, 1} may be described as follows.

            Δ       0       1
            0       1       0
            1       0       1

That is, 0 Δ 0 = 1, 0 Δ 1 = 0, 1 Δ 0 = 0, and 1 Δ 1 = 1. Test the
operation Δ for commutativity and associativity.





SUPPLEMENTARY READING 


Goldberg, S., Probability, An Introduction (Prentice-Hall, Inc., Englewood Cliffs,
N. J., 1960), Chapter 1.

Luchins, A. S., and Luchins, E. H., Logical Foundations of Mathematics for
Behavioral Scientists (Holt, Rinehart and Winston, Inc., New York, 1965).

Suppes, P., Introduction to Logic (D. Van Nostrand Company, Princeton, N. J.,
1957), Chapter 9.

Wilder, R. L., Introduction to the Foundations of Mathematics (John Wiley & Sons,
Inc., New York, 1965), 2nd ed.



SYMBOLIC LOGIC 


2.1 STATEMENTS AND THEIR TRUTH SETS

Behavioral science, like most subjects, has its own special vocabulary used
with ordinary language. The set-theoretic notation of Chapter 1 is only used
if it helps to clarify the discussion. In this chapter, we extend the clarity and
precision of set notation to the analysis of statements by relating set theory
to logic. This allows us to translate behavioral systems into simple sentences
and then use logical notation and operations to formalize these systems.

Logic deals with statements which can be classified as being either true (T)
or false (F) but not both. The symbols T and F represent the possible truth
values a statement may have. To be considered within the scope of logic, a
statement must be capable of being assigned a specific truth value. Thus,
such a paradoxical statement as "This sentence enclosed in quotation marks
is false" will not be allowed in our discussion. If the statement is false, that
is, if it is false that the sentence is false, then it must be true. On the other
hand, if it is true, then by its own admission it is false. Either way a
contradiction results. This situation is similar to Russell's Paradox and the
requirement that a set be well defined.

It is not required that we actually know the truth value of each statement
used. For instance, the sentence "The population of the United States is 190
million" is acceptable for inclusion in a logical discussion, even though we
do not know whether it is true or false. The important point is that it is
possible to assign a truth value to the statement.

As in the theory of sets, we begin by specifying a domain of discourse
(universal set) U. All statements are assumed to pertain to the elements of U.
A particular statement will be true for some elements of U and false for
others. With any statement p we associate the set P containing those
members of U for which p is true. The set P is called the truth set in U or, simply,
the truth set of the statement p.


Example 1 If U is the set of real numbers, the truth set of the statement

p: x² + 3x = 4

is the set P = {1, −4} of solutions of this equation. If U is the set of
positive numbers, the truth set of p is Q = {1}. ►

Example 2 Let U be the set of all business establishments in the United
States. Then the set P of profitable businesses may be denoted

P = {z | z is a profitable business}

In this form it is apparent that P is the truth set of the statement

p: z is a profitable business ►
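A truth set can be computed directly when the domain of discourse is finite. The sketch below replaces the real numbers by the integers from −10 to 10, a finite stand-in chosen purely for illustration (an assumption), and the helper name truth_set is our own.

```python
def truth_set(statement, universe):
    """Return the members of the universe for which the statement is true."""
    return {x for x in universe if statement(x)}


def p(x):
    """The statement p: x² + 3x = 4."""
    return x**2 + 3 * x == 4


# Finite stand-in for the real numbers (an assumption for illustration).
U = range(-10, 11)
P = truth_set(p, U)

# Restricting the domain of discourse to positive numbers changes the truth set.
positives = [x for x in U if x > 0]
Q = truth_set(p, positives)

print(sorted(P))   # [-4, 1]
print(sorted(Q))   # [1]
```

The same statement p yields different truth sets in different universal sets, which is why the domain of discourse must be fixed before truth sets are discussed.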


PROBLEMS 

1. Let A = {a, b, c, d} and let B denote a subset of A. What is the truth set
of the statement B ⊆ {a, c}? What universal set U is implied here?

In Section 1.1, we indicated that one way of describing a set is to state a
criterion which members of the set must satisfy. The set so described then
becomes the truth set of the stated criterion. Since this may be done for
any set whatsoever, it follows that there is a complete correspondence
between truth sets and statements, and hence between set theory and logic.
We shall utilize this correspondence and our knowledge about sets to develop
the basic concepts of logic.

A statement which is true for all elements of U is said to be logically true,
or to be a tautology. A statement which is never true is logically false, or a
self-contradiction. The universal set U itself is the truth set of a tautology,
while the empty set ∅ is the truth set of a self-contradiction. For instance, if
U is the set of real numbers x, then 2x + 2 = 2(x + 1) is a tautology,*
while x − 3 = x + 2 is a self-contradiction.

To take a behavioral example, suppose we define reinforcement as any
stimulus which increases the strength of an immediately preceding response
R. Then if learning is defined as an increase in the strength of response R,
the statement "Reinforcement facilitates learning response R" is a tautology.

With any statement p may be associated another statement not-p, called
the negation of p, and symbolized by ~p. If, for example, p is the statement
"absolute power corrupts," then ~p is "absolute power does not corrupt."
The negation ~p is true for exactly those elements of U for which p is false,
so its truth set is the complement P' of the truth set P of p.

*A tautology for numbers is usually called an identity.


PROBLEMS 

2. The price x of a certain commodity can be any value in the set U =
{1, 2, 4, 6, 8, 10}. The statement p is "x² + 16 = 10x." What is the
truth set of ~p?


In addition to the relation between negation and complementation, there
are other operations on statements which correspond to intersection, union,
subset, and equality for sets.

Let p be the statement "Material costs have increased" and q the
statement "Labor productivity has declined." The statement "Material costs
have increased and labor productivity has declined" is called the conjunction
of statements p and q and is denoted by p ∧ q. The disjunction of p and q,
denoted by p ∨ q, is "Material costs have increased or labor productivity
has declined."

An expression of the form "If material costs have increased, then labor
productivity has declined" is a conditional statement (denoted p ⇒ q)
connecting p and q. Finally, the expression "Material costs have increased if
and only if labor productivity has declined" is a biconditional statement
(p ⇔ q) relating p and q.

The symbols ~, ∧, ∨, ⇒, and ⇔ representing, respectively, negation,
conjunction, disjunction, conditional, and biconditional, are the five basic
sentential connectives of symbolic logic. We shall adopt the point of view
that a connective is defined by the way in which its truth set is related to the
truth sets of the statements being connected. For instance, the following
definition formalizes our discussion of the negation of a statement.

Definition 1 Let p be a statement having truth set P. Any statement
whose truth set is P' is called the negation of p, ~p.

In deciding upon the truth sets of the remaining connectives, we shall be
guided by the way in which these connectives are used in ordinary discourse.
In asserting the conjunction p ∧ q, for instance, we normally mean to imply
that p and q are both true. If either p or q is false then the conjunction is
false. If P and Q are the respective truth sets for p and q, it follows that
p ∧ q is true only for those elements of U which are contained in both P
and Q. In other words, the truth set of p ∧ q is P ∩ Q.

The word or is used in ordinary language in two different senses. In the
exclusive sense p or q means either p or q is true, but not both. If you say
"The treatment is either effective or ineffective," everyone knows that the
exclusive sense is intended. On the other hand, the inclusive sense (the legal
and/or) is apparent in "Extensive negotiation may weaken our bargaining
position or prolong the conflict." Of course, both might happen.

In mathematics and logic p or q is always used in the inclusive sense, p or
q, or both. If the exclusive p or q, but not both, is intended, it will be
explicitly stated. Thus the truth set of p ∨ q is P ∪ Q, the set of elements
contained in either P or Q, or in both.

Example 3 If x denotes a real number, p is the statement x² + 3x = 4,
and q the statement 2x² − 6x = −4, then the truth set of p ∧ q is {1},
the common solution of the two equations. The truth set of p ∨ q is

{1, 2, −4}. ►
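The truth sets of Example 3 can also be checked mechanically. The following Python sketch is an illustration only. It assumes a finite stand-in for the universal set (the integers from -10 to 10, which contain every root of the two equations), and the names U, p, q, and truth_set are hypothetical.

```python
# Finite stand-in for the universal set U. This is an assumption: in the
# example U is the set of all real numbers, but both equations have only
# integer roots, so the integers -10..10 are enough for a check.
U = set(range(-10, 11))

def p(x):
    """Statement p of Example 3: x^2 + 3x = 4."""
    return x**2 + 3 * x == 4

def q(x):
    """Statement q of Example 3: 2x^2 - 6x = -4."""
    return 2 * x**2 - 6 * x == -4

def truth_set(statement, universe):
    """Collect the elements of the universe for which the statement is true."""
    return {x for x in universe if statement(x)}

P = truth_set(p, U)
Q = truth_set(q, U)

conjunction = P & Q   # truth set of p AND q is the intersection of P and Q
disjunction = P | Q   # truth set of p OR q is the union of P and Q

print(sorted(P), sorted(Q))
print(sorted(conjunction), sorted(disjunction))
```

Here the set operators & and | play the roles of intersection and union.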


PROBLEMS 

3. Verify the results stated in Example 3.

When stating the conditional “If p, then q” we are claiming that if p is
true, then q is also true. Hence, if e is an element of U for which p is true
(that is, if e ∈ P), then we must also have e ∈ Q. To say that the conditional
p => q is true, then, is to say that P is a subset of Q.

Looking at it another way, the conditional p => q is false only for those
elements e of U which violate the condition P ⊂ Q, that is, only for an e
which is a member of P, but not a member of Q. These are the elements in
P ∩ Q′, shown unshaded in Figure 1. The remaining elements, those for
which p => q is true, are members of the set (P ∩ Q′)′ = P′ ∪ Q.


FIGURE 1 



The truth set P′ ∪ Q is represented by the shaded region of Figure 1.
Note that within this truth set, an element e for which p is true (e ∈ P) is
also an element for which q is true (e ∈ Q). However, the truth of p is not
necessary for the truth of p => q. In fact, since P′ is entirely shaded, it follows
that p => q is considered true whenever p is false, regardless of the truth or
falsity of q.

As most persons initially find this point of view at variance with their
intuitive feelings, there may be some objections to our procedure. In particular,
it might be maintained that in order to express the conditional “if p, then
q” the statements p and q should be related in some way.

But how is one to decide which statements are related and which are not?
Certainly any intuitive feelings about relatedness will vary from one person
to another and will depend on the context in which the statements appear.
Our approach has been deliberately chosen to avoid the complications which
arise from questions of relatedness, the definitions being made in such a way
that the truth set of a logical form depends only on the form itself and not on
the specific statements involved.

Granting, then, that truth sets should depend only on the logical form,
have we made the proper decision for the conditional form p => q when p
is false? For the case when q is also false, consider the statement

If Canada has 300 million inhabitants and the United States only 200
million, then the population of Canada is greater than that of the United
States.

For the case when q is true, consider

If Canada has 300 million inhabitants and the United States has 400
million, then the population of the United States exceeds that of Canada.

It is hard to imagine anyone denying the truth of these statements. But if
this is the case, then the truth set for the conditional form is determined, and
must be as we have chosen it.
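The choice of P′ ∪ Q as the truth set of the conditional can be compared with the four possible combinations of truth values for p and q. The short Python sketch below is only an illustration. The function name conditional is an assumption, and True and False stand for the truth values of the component statements.

```python
from itertools import product

def conditional(p, q):
    """Truth value of p => q, mirroring the truth set P' U Q: (not p) or q."""
    return (not p) or q

# Tabulate p => q for all four combinations of truth values.
table = {(p, q): conditional(p, q) for p, q in product([True, False], repeat=2)}
for (p, q), value in table.items():
    print(p, q, value)

# Collect the combinations for which the conditional is false.
false_rows = [row for row, value in table.items() if not value]
print(false_rows)
```

The only falsifying combination is p true and q false, which corresponds to the unshaded region P ∩ Q′.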

The biconditional p <=> q is intended to mean (p => q) ∧ (q => p). Since
the conjunction of two statements is true only when both statements are
true, the truth set of p <=> q must be the intersection of P′ ∪ Q (the truth set
of p => q) and P ∪ Q′ (the truth set of q => p). This intersection is the set
(P ∩ Q) ∪ (P′ ∩ Q′) shaded in Figure 2. We see that the biconditional is
true either when p and q are both true or when they are both false.


FIGURE 2



Example 4 For the statements p and q of Example 3, the truth set of
p => q is the set containing all real numbers except the integer −4. The
truth set of p <=> q contains all real numbers except −4 and 2. ►


PROBLEMS 

4. Work out the details of Example 4.

Definition 2 summarizes our discussion.





Definition 2 Let p and q be statements having truth sets P and Q,
respectively. Then

(a) the conjunction p ∧ q is a statement whose truth set is P ∩ Q;

(b) the disjunction p ∨ q is a statement whose truth set is P ∪ Q;

(c) the conditional p => q is a statement whose truth set is P′ ∪ Q;

(d) the biconditional p <=> q is a statement whose truth set is
(P ∩ Q) ∪ (P′ ∩ Q′). ►

Definitions 1 and 2 may be used to determine truth sets of statements in- 
volving combinations of the basic connectives. 
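Definitions 1 and 2 translate directly into operations on finite sets. The following Python sketch is illustrative only. The function names and the six-element universe are assumptions chosen for the example, not part of the definitions.

```python
def negation(P, U):
    """Definition 1: the truth set of ~p is the complement P'."""
    return U - P

def conjunction(P, Q):
    """Definition 2(a): the truth set of p AND q is the intersection."""
    return P & Q

def disjunction(P, Q):
    """Definition 2(b): the truth set of p OR q is the union."""
    return P | Q

def conditional(P, Q, U):
    """Definition 2(c): the truth set of p => q is P' U Q."""
    return (U - P) | Q

def biconditional(P, Q, U):
    """Definition 2(d): the truth set of p <=> q is (P n Q) U (P' n Q')."""
    return (P & Q) | ((U - P) & (U - Q))

# Hypothetical data: a universe of six elements and two truth sets.
U = {1, 2, 3, 4, 5, 6}
P = {1, 2, 3}
Q = {3, 4}

print(sorted(conditional(P, Q, U)))
print(sorted(biconditional(P, Q, U)))
```

With these sample sets, the truth set of the biconditional coincides with the intersection of the truth sets of p => q and q => p, as the discussion of the biconditional requires.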


Example 5 To find the truth set of the statement

~[p ∧ (~p ∨ q)] ∧ q                                            (1)

we work from the innermost parentheses outward, as follows:

1. The truth set of ~p is P′.                [Definition 1]

2. The truth set of ~p ∨ q is P′ ∪ Q.        [Definition 2(b)]

3. The truth set of p ∧ (~p ∨ q) is          [Definition 2(a) and distributive law
   P ∩ (P′ ∪ Q) = P ∩ Q.                     for sets]

4. The truth set of ~[p ∧ (~p ∨ q)] is       [Definition 1 and De Morgan's law
   (P ∩ Q)′ = P′ ∪ Q′.                       for sets]

5. The truth set of (1) is                   [Definition 2(a) and distributive law
   (P′ ∪ Q′) ∩ Q = P′ ∩ Q.                   for sets] ►


PROBLEMS 

5. Find the truth sets of the following: 

(a) p ∧ p        (b) => q        (c) p ∧ ~p

The next example shows how to determine truth sets of logical forms which 
combine three or more distinct statements. 

Example 6 The truth set of

[(p ∧ q) ∨ ~p] => (r => q)                                     (2)

is

P ∪ Q ∪ R′                                                     (3)

determined as follows:

1. The truth set of (p ∧ q) ∨ (~p) is        [Definitions 1, 2(a), 2(b), and
   (P ∩ Q) ∪ P′ = P′ ∪ Q.                    distributive law for sets]

2. The truth set of r => q is R′ ∪ Q.        [Definition 2(c)]

3. The truth set of (2) is
   [P′ ∪ Q]′ ∪ [R′ ∪ Q].                     [Definition 2(c)]

This latter expression may be reduced to (3) by use of Theorem 1 of
Chapter 1, or by studying the diagrams shown in Figure 3. We have shown that
the logical form (2) is true if either p or q is true, or if r is false. That is, (2)
has the same truth set as p ∨ q ∨ ~r. ►


FIGURE 3 The truth set of (2) is, indeed, given by (3), since the total shaded
regions of both Venn diagrams are equal.
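The agreement between the logical form (2) and the truth set (3) can also be confirmed by listing all eight combinations of truth values for p, q, and r. The Python sketch below is a check of that kind, not a replacement for the set-theoretic argument. The function names implies, form_2, and form_3 are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

def form_2(p, q, r):
    """Logical form (2): [(p and q) or not p] => (r => q)."""
    return implies((p and q) or (not p), implies(r, q))

def form_3(p, q, r):
    """A statement whose truth set is (3): p or q or not r."""
    return p or q or (not r)

# Compare the two forms on every assignment of truth values.
agree = all(
    form_2(p, q, r) == form_3(p, q, r)
    for p, q, r in product([True, False], repeat=3)
)
print(agree)
```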





PROBLEMS 



6. Using letters to stand for the simple statements, express the following
in symbolic notation.

(a) [...]

(b) [...] amount of labor such that the marginal revenue of the last man
hired approximates his wage, then he will obtain maximum profits.

(c) Either the product is faulty and is not accepted by the quality control
department or the product is not faulty and [...] quality control
department.

(d) The state legislature will determine [...] if no candidate receives
a majority of the popular vote.

(e) If John takes the exam and scores over 70, he does not have to take
the course for credit.

(f) John Smith is not an employee of the ABC Company or the XYZ
Company.

(g) Theories either explain events or they are eventually discarded and
replaced by those which do.

7. In the following example, determine the truth value of the compound
statements (a)-(c) given the truth values of the following component
statements.

(i) “Mr. Black has a more favorable attitude than Mr. Blue” is true.
(ii) “Mr. Greene has a less favorable attitude than Mr. White” is false.
(iii) “Mr. Blue has a neutral attitude” is true.
(iv) “Mr. Brown has equally as favorable an attitude as Mr. Black” is
false.

(a) If Mr. Black has a more favorable attitude than Mr. Blue, then Mr.
Greene does not have a less favorable attitude than Mr. White.

(b) If either Mr. Brown does not have equally as favorable an attitude
as Mr. Black or Mr. Greene has a less favorable attitude than Mr.
White, then Mr. Blue has a neutral attitude.

(c) If Mr. Brown does not have equally as favorable an attitude as Mr.
Black, then either Mr. Greene has a less favorable attitude than
Mr. White or Mr. Blue does not have a neutral attitude.

8. Suppose we assume that the statements

p = consumption is a function of total real profits and total real wages
q = net investment is a function of profits and capital stock
r = private wage bill is a function of net national income and indirect
taxes

are true, and that the statement

s = net national income plus indirect taxes is a function of investment

is false. Find the truth values of the following statements:

(a) (p <=> q) => (s <=> r)

(b) (p ∧ q) ∧ r

(c) s => p

(d) p => [q <=> (r => s)]

(e) (p ∧ q) <=> (r ∧ ~s)

(f) (~p => q) => (s => r)

(g) p => [q => (q => p)]

9. Let p, q, and r be statements having associated truth sets P, Q, and R,
respectively. Which of the following statements are tautologies?

(a) (p ∨ q) [...] (q ∨ p)

(b) p => (p ∨ q) ∨ r

(c) p => [q => (q => p)]

(d) [(p => q) <=> q] => [...]

(e) (p ∧ q) => (p ∨ [...]

(f) [(p ∧ q) => (p ∧ ~p)] [...]

(g) (p => q) <=> (q => p)

(h) [...]


10. Let ⊻ denote the exclusive “or.” That is,

p ⊻ q = (p ∨ q) ∧ ~(p ∧ q)

Finish filling Table 1.


TABLE 1 Truth table for ⊻

Truth value    Truth value    Truth value of
of p           of q           p ⊻ q

T              T              F
T              F
F              T
F              F

11. Write truth tables for the five basic connectives. Table 2 is a partial
answer to this problem.

TABLE 2 Truth table for conjunction ∧

Truth value    Truth value    Truth value of
of p           of q           p ∧ q

T              T              T
T              F              F
F              T              F
F              F              F


12. Show that the statement ~p ∧ [q ∧ (~q ∨ p)] is a self-contradiction.

2.2 EQUIVALENCE AND IMPLICATION

Of the five sentential connectives, two (the biconditional and the
conditional) are of fundamental importance in mathematics. However, these
connectives are ordinarily used in a somewhat restricted form. In
mathematics, as in everyday discourse, when we state the biconditional p <=> q, we
are claiming that it is true. We wish to rule out the possibility that one
statement, p or q, is true and the other false. A biconditional statement p <=> q
which cannot be false, that is, whose truth set is the universal set U, is called
an equivalence between p and q. A glance at Figure 2 shows that equivalence
of a pair of statements may be deduced from equality of their respective
truth sets.





Since equivalent statements have identical truth values, they may be
substituted for one another in a logical form. The same considerations arise as
in algebra where, for example, it is useful to know that (x + y)² may always
be replaced by x² + 2xy + y² (or vice versa). Substitutions in logical forms,
as in algebraic expressions, are made with an eye toward reducing the
complexity of an expression or making it more amenable to the solution of the
problem at hand. Table 3 lists some important equivalences which
correspond to the set-theoretic equalities detailed in Theorem 1 of Chapter 1.


TABLE 3

                          Equivalent statements                   Corresponding
Logical terminology       Statement 1       Statement 2           set-theoretic result

Law of double negation    ~(~p)             p                     (P′)′ = P

Idempotent laws           p ∨ p             p                     Idempotent laws
                          p ∧ p             p                     for sets

Commutative laws          p ∨ q             q ∨ p                 Commutative laws
                          p ∧ q             q ∧ p                 for sets

Associative laws          (p ∨ q) ∨ r       p ∨ (q ∨ r)           Associative laws
                          (p ∧ q) ∧ r       p ∧ (q ∧ r)           for sets

Distributive laws         p ∨ (q ∧ r)       (p ∨ q) ∧ (p ∨ r)     Distributive laws
                          p ∧ (q ∨ r)       (p ∧ q) ∨ (p ∧ r)     for sets

De Morgan's laws          ~(p ∨ q)          (~p) ∧ (~q)           De Morgan's laws
                          ~(p ∧ q)          (~p) ∨ (~q)           for sets

Law of the excluded       p ∨ ~p is always true (is a             P ∪ P′ = U
middle                    tautology)

Law of contradiction      p ∧ ~p is never true (is a              P ∩ P′ = ∅
                          self-contradiction)
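Several rows of Table 3 can be spot-checked by comparing truth values over every assignment. The Python sketch below is an illustration under stated assumptions. The helper name equivalent and the dictionary laws are hypothetical, and only a selection of the laws is included.

```python
from itertools import product

def equivalent(f, g, n):
    """True when f and g have the same truth value for every assignment
    of truth values to their n component statements."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))

# Each entry: (statement 1, statement 2, number of component statements).
laws = {
    "double negation": (lambda p: not (not p), lambda p: p, 1),
    "distributive, or over and": (
        lambda p, q, r: p or (q and r),
        lambda p, q, r: (p or q) and (p or r),
        3,
    ),
    "distributive, and over or": (
        lambda p, q, r: p and (q or r),
        lambda p, q, r: (p and q) or (p and r),
        3,
    ),
    "De Morgan, negated or": (
        lambda p, q: not (p or q),
        lambda p, q: (not p) and (not q),
        2,
    ),
    "De Morgan, negated and": (
        lambda p, q: not (p and q),
        lambda p, q: (not p) or (not q),
        2,
    ),
}

results = {name: equivalent(f, g, n) for name, (f, g, n) in laws.items()}
for name, ok in results.items():
    print(name, ok)

# Excluded middle is always true; the contradiction is never true.
excluded_middle = all(p or (not p) for p in [True, False])
contradiction = any(p and (not p) for p in [True, False])
print(excluded_middle, contradiction)
```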



Example 1 The statement (q ∨ ~q) => ~p is equivalent to ~p since its
truth set is

(Q ∪ Q′)′ ∪ P′ = U′ ∪ P′
               = ∅ ∪ P′
               = P′

which is the truth set of ~p.

Since the two statements (q ∨ ~q) => ~p and ~p have the same truth set
P′, either of these statements may be called the negation of statement p. This
explains what may have seemed to be odd wording in Definitions 1 and 2,
which allowed a statement to have many negations or a pair of statements
to have more than one conjunction, disjunction, conditional, or
biconditional. ►


PROBLEMS

1. Show that P ∪ Q is the truth set of ~p => q and hence that this
statement is a disjunction of p and q.

Like the biconditional, the conditional also is usually stated only when
it is true. If the conditional statement p => q cannot be false, then it is called
an implication and we say that p implies q.
A look at Figure 1 shows that p implies q if and only if there are no
elements x such that x ∈ P and x ∉ Q. That is to say, p implies q whenever P
is a subset of Q. In this case, q is true whenever p is true.


Example 2 Does the conjunction of the statements “Interest rates do not
increase or taxes increase” and “Interest rates increase” imply the statement
“Taxes increase”?

Let p be “Interest rates increase” and q be “Taxes increase.” The problem
is whether (~p ∨ q) ∧ p implies q. Using Definitions 1 and 2 we find that
the truth set of (~p ∨ q) ∧ p is

(P′ ∪ Q) ∩ P = (P′ ∩ P) ∪ (Q ∩ P)
             = Q ∩ P

Since Q ∩ P ⊂ Q, it follows that q is indeed implied by (~p ∨ q) ∧ p.
Incidentally, Q ∩ P is also a subset of P, of Q ∩ P, and of Q ∪ P. Hence
the statements p, p ∧ q, and p ∨ q are also implied by (~p ∨ q) ∧ p. In
fact, any statement is implied whose truth set contains P ∩ Q. ►
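The conclusions of Example 2 can be tested by examining every combination of truth values. The Python sketch below is illustrative, with is_implication and premise as hypothetical names. It treats an implication as a conditional that is true for every assignment.

```python
from itertools import product

def is_implication(premise, conclusion, n=2):
    """premise implies conclusion when the conditional
    premise => conclusion is true for every assignment of truth values."""
    return all(
        (not premise(*v)) or conclusion(*v)
        for v in product([True, False], repeat=n)
    )

def premise(p, q):
    """(~p or q) and p, the conjunction of the two statements in Example 2."""
    return ((not p) or q) and p

implied = {
    "q": is_implication(premise, lambda p, q: q),
    "p": is_implication(premise, lambda p, q: p),
    "p and q": is_implication(premise, lambda p, q: p and q),
    "p or q": is_implication(premise, lambda p, q: p or q),
    "not p": is_implication(premise, lambda p, q: not p),
}
print(implied)
```

The last entry is included for contrast: ~p has truth set P′, which does not contain P ∩ Q, so it is not implied.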


PROBLEMS 

2. Does the conjunction of the statements “If interest rates increase, then
taxes decrease” and “Taxes decrease” imply the statement “Interest rates
increase”?


Implication differs from other sentential connectives in that it lacks
symmetry. While p ∧ q is equivalent to q ∧ p, p ∨ q is equivalent to q ∨ p,
and p <=> q is equivalent to q <=> p, p => q is not equivalent to q => p.
For instance, if p: “The Dodgers win the pennant” and q: “The final
standings show the Dodgers ahead of the Mets” are two statements about this
year's National League race, then p implies q but q certainly does not imply
p. There are many possible outcomes of the pennant race in which q is true
and p false.

The statement q => p is called the converse of p => q. Many common
fallacies in thinking arise from confusion of a statement with its converse. For
example, if p is the statement “The enemy halts his aggressive acts” and q
the statement “U.S. forces will be withdrawn,” it is obvious that p => q is
not equivalent to q => p.

The implication p => q is true if the corresponding truth sets are related
by P ⊂ Q. But if P ⊂ Q, then it must be that Q′ ⊂ P′, and conversely. That
is, P ⊂ Q is equivalent to Q′ ⊂ P′. Since Q′ and P′ are the respective truth
sets of ~q and ~p, it follows that p => q is equivalent to ~q => ~p. The
form ~q => ~p is called the contrapositive of the implication p => q.
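The relation of a conditional to its contrapositive and to its converse can be seen by comparing truth values in all four cases. The Python sketch below is an illustration only, and the name implies is hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

rows = list(product([True, False], repeat=2))

# p => q and ~q => ~p agree in every case.
contrapositive_matches = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# p => q and q => p do not; list the cases in which they differ.
differences = [(p, q) for p, q in rows if implies(p, q) != implies(q, p)]

print(contrapositive_matches)
print(differences)
```

The two cases of disagreement are exactly those in which one of p and q is true and the other false.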

Example 3 A well-known perceptual law states that if an observer looks at
a colored object for a sufficiently long period of time and immediately
afterward looks at an illuminated neutral surface, he will see an after-image of
the object in the complementary color.

If we let

p = an observer looks at a colored object for a sufficiently long
period of time

q = the observer looks at an illuminated neutral surface
immediately afterward

r = the observer sees an after-image in the complementary color

the law can be expressed in symbolic notation as

p ∧ q => r

The contrapositive of the implication is ~r => ~(p ∧ q), which can be
rewritten as ~r => ~p ∨ ~q. This states that if an observer did not see an
after-image of complementary color, then either he did not look at the
object for a sufficiently long period of time or he did not look at an illuminated
neutral surface immediately afterward. ►

An implication p => q, then, is equivalent to its contrapositive ~q => ~p,
but is not equivalent to its converse q => p. Of course, the converse q => p
is equivalent to its contrapositive ~p => ~q. Thus, associated with any
implication are three other implications, the four forming two pairs of
equivalent statements. These statements are summarized in Table 4.

TABLE 4 Original implication p => q and associated implications

Original statement            p => q
Converse                      q => p
Contrapositive                ~q => ~p
Contrapositive of converse    ~p => ~q

The original statement and its contrapositive are equivalent, as are the
converse and the contrapositive of the converse. The original statement and
the converse are not equivalent.

The language used to express implications takes a variety of forms. For
instance, the statement “Interest rates will rise only if there is excessive
inflation” (p only if q) expresses the same thought as the implication “If
interest rates rise, then there is excessive inflation” (if p, then q). If it is true
that the only time interest rates rise is in the presence of rapid inflation, then
an observed rise in interest rates allows the conclusion that such inflation must
have occurred.

Another way commonly used to express the implication “if p, then q” is
to say that p is a sufficient condition for q. That is, in order to conclude q, it
is sufficient to know p. The statement “p is a necessary condition for q” is
equivalent to “if q, then p” and to “q only if p.” If q occurs, it necessarily
follows that p also occurs. The assertion of a necessary condition is the
converse of the assertion of a sufficient condition.

Equivalence between two statements p and q may be expressed either by “p
if and only if q” or by “p is both a necessary and a sufficient condition for q.”
A source of confusion between implication and equivalence arises from
the way in which definitions are stated. For instance, the concept of power
is one of central importance to the political scientist. A definition often used
is “If A can influence the behavior of B, then A has power over B.” In logical
form this is a conditional statement (p => q) claiming that influence is a
sufficient condition for power. If influence is a sufficient but not a necessary
condition for power, then it is possible that A has power over B but no
influence. In this case there are conditions of power other than those
specified and the definition is deficient in that it fails to specify the necessary
conditions.

Properly stated, a definition specifies an equivalence between two
statements. However, it is almost universal in science and mathematics to state
definitions in conditional terms (p => q), as above, it being understood that
equivalence (q => p as well as p => q) is intended. Thus, in the illustration
above the possession of power is intended to be equivalent to the ability to
influence behavior.


PROBLEMS 

3. Let p, q, and r be statements with truth sets P, Q, and R, respectively.
Show that the following are pairs of equivalent statements:

(a) ~(~p ∧ ~q) and p ∨ q

(b) (p ∧ q) ∨ (p ∧ ~q) and p

(c) ~[~p ∧ (p ∨ q)] and p ∨ ~q

(d) ~[~p ∧ (q ∧ r)] and p ∨ ~q ∨ ~r

(e) ~[p ∧ (p ∨ q)] and ~p

(f) q ∧ ~[p ∧ (~p ∨ q)] and ~p ∧ q

4. Determine which of the following statements are equivalent to the statement
“[...] government subsidies [...] overproduction.”

(a) [...] that there are [...] and no [...]

(b) Either there are government subsidies and overproduction or there
is overproduction.

(c) There are government subsidies or there is overproduction and no
government subsidies.

(d) There are neither government subsidies nor overproduction.

(e) There are government subsidies and no overproduction.

(f) If there are no government subsidies, then there is overproduction.

5. Which of the following conditional statements are implications?

(a) ~(~p) => p

(b) p ∧ q => q

(c) (p => ~p) => p

(d) (p ∧ q) => (p ∨ r)

(e) (p ∧ q) => [...]

(f) p ∧ q => p ∨ q

(g) (q ∨ ~q) => p

(h) [~q ∧ (p => q)] => ~p

6. If ⊻ is as defined in Problem 10, Section 2.1, show that

(a) p ⊻ p is a self-contradiction

(b) p ⊻ q is equivalent to p when q is false

(c) p ⊻ q is equivalent to ~p when q is true

(d) p ⊻ q is equivalent to q ⊻ p

(e) p ⊻ (q ⊻ r) and (p ⊻ q) ⊻ r are equivalent

(f) p ∧ (q ⊻ r) and (p ∧ q) ⊻ (p ∧ r) are equivalent

7. Find truth values for p, q, and r such that

p ⊻ (q ∧ r) and (p ⊻ q) ∧ (p ⊻ r)

have different truth values. What does this say concerning the
equivalence of these two statements?

8. [...] increased housing expenditures do not accompany [...]

(a) Increased housing expenditures accompany demolition only if slums
[...]

(b) [...] necessary condition for increased housing expenditures to
accompany [...]

(c) [...]

(d) A sufficient condition for the continuation of slums is [...] increased
housing expenditures do not [...] demolition.

(e) If increased housing expenditures accompany demolition, then slums
will disappear.

(f) If slums disappear, then increased housing expenditures accompany
demolition.



is valid since the intersection of the truth sets P′ ∪ Q and Q′ ∪ R of the
premises is a subset of the truth set P′ ∪ R of the conclusion.
The seeming paradox of a valid argument resulting in a false conclusion
is resolved if we observe that the second premise in the argument is
obviously false. It is not surprising that a false premise should lead to a false
conclusion, even though the conclusion is correctly derived from the premises.

PROBLEMS 

1. Draw a Venn diagram to verify the validity of the argument in Example 1.


Example 2

If Lincoln was born in Quebec, then Lincoln was
born in Canada.

Lincoln was not born in Quebec.

Therefore, Lincoln was not born in Canada.

The conclusion, of course, is true. But the argument is not valid, the truth
set of the conjunction of the premises being (P′ ∪ Q) ∩ P′ = P′, which is
not a subset of Q′, the truth set of the conclusion.
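That the form of argument in Example 2 is not valid can be shown by exhibiting a case in which both premises are true and the conclusion is false. The Python sketch below searches for such cases. It is illustrative only, and the names implies and counterexamples are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

# Premises: p => q and ~p.  Conclusion: ~q.
# A counterexample makes both premises true and the conclusion false,
# that is, it makes q true.
counterexamples = [
    (p, q)
    for p, q in product([True, False], repeat=2)
    if implies(p, q) and (not p) and q
]
print(counterexamples)
```

The single counterexample has p false and q true. In terms of the example it describes a person born in Canada, but not in Quebec.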


An argument which is invalid is called a fallacy. Example 2 illustrates
one of the most common fallacies, in which the “if” part, or antecedent, of a
conditional statement (p in the example above) is denied in an attempt to
deny the “then” part, or consequent (q above).
A form of valid argument closely related to the Law of Detachment is
Modus Tollendo Tollens, the method of denying (tollendo) the consequent
of a conditional statement in order to deny (tollens) the antecedent. This
type of argument is displayed in Table 6. Replacing the premise p => q by
its equivalent contrapositive form ~q => ~p, we see that this is the
Law of Detachment in disguised form. Alternatively, we may observe that
the intersection Q′ ∩ (P′ ∪ Q) = Q′ ∩ P′ of the truth sets of the premises
is a subset of P′. Hence this is a valid form of argument.

TABLE 6 Modus Tollendo Tollens

                   Statement    Truth set

First premise      ~q           Q′
Second premise     p => q       P′ ∪ Q

Conclusion         ~p           P′


Other common forms of valid arguments are listed in Table 7. Their use
is illustrated in the following examples. For simplicity, the conjunction



2.3 VALID ARGUMENTS


conjunction of the premises p and p => q allows you to conclude q. This type
of reasoning is tabulated in Table 5.


TABLE 5 The Law of Detachment

                   Statement    Truth set

First premise      p            P
Second premise     p => q       P′ ∪ Q

Conclusion         q            Q


An argument consists of two major components, a collection p₁, p₂, ..., pₙ
of statements called premises, and a statement q, the conclusion. The
argument is said to be valid if the conjunction of the premises implies the
conclusion. Since a conditional statement p => q is an implication if and only if the
truth set of p is a subset of the truth set of q, it follows that an argument is
valid if and only if the intersection P₁ ∩ P₂ ∩ ··· ∩ Pₙ of the truth sets of
the respective premises is a subset of the truth set Q of the conclusion.
In our example, the intersection of the truth sets P and P′ ∪ Q of the
premises is P ∩ (P′ ∪ Q) = P ∩ Q. Since this latter set is a subset of Q, it
follows that the conjunction of the premises does indeed imply the
conclusion. Hence the argument is valid.

Note that the validity of an argument depends only upon its logical form
and not upon the truth or falsity of the particular statements involved.
The fact that a valid argument is an implication means that the conclusion
must be true whenever the premises are all true. However, if the premises
are not all true, then the truth of the conclusion is neither necessary nor
sufficient for the validity of the argument.
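The definition of validity suggests a direct test: examine every combination of truth values and look for a case in which all premises are true and the conclusion is false. The Python sketch below is one way to write such a test. The names implies and is_valid are assumptions made for the illustration.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

def is_valid(premises, conclusion, n):
    """An argument is valid when the conclusion is true in every case
    in which all of the premises are true."""
    for values in product([True, False], repeat=n):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

# Law of Detachment: from p and p => q, conclude q.
detachment = is_valid(
    [lambda p, q: p, lambda p, q: implies(p, q)],
    lambda p, q: q,
    2,
)

# For contrast, drop the first premise: from p => q alone, conclude q.
without_p = is_valid([lambda p, q: implies(p, q)], lambda p, q: q, 2)

print(detachment, without_p)
```

The test depends only on the logical form of the argument, in keeping with the remark that validity does not depend on the particular statements involved.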


Example 1

If a person is a leader, then he has power to influence others.         p => q

If a person has power to influence others, then he is respected.        q => r

Therefore, if a person is a leader, then he is respected by others.     p => r

Obviously, the conclusion is false. History is replete with examples of
despotic leaders who were feared but not respected. However, the argument


(3) (p ∧ ~q) => ~p

(4) (p ∧ ~q) => q

(5) (p ∧ ~q) => (r ∧ ~r)

In cases (3)-(5) we see that in venturing an indirect proof we are permitted
to introduce the negation of the conclusion as another premise. This adds to
the available information and is especially convenient in cases where it is
difficult to see how to begin a direct proof. There are then a variety of
conclusions, any one of which establishes the proof.

Case (5) is a particularly useful form closely related to the Law of
Absurdity. If from the conjunction p ∧ ~q we are able to imply the false
statement r ∧ ~r, then p ∧ ~q must itself be false. Hence if p is true, ~q must
be false. That is, if p is true, then so is q, which means p => q.
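The equivalence of the five conditional forms, with (1) taken as p => q and (2) as ~q => ~p, can be confirmed by a truth-table comparison. The Python sketch below is an illustration under that reading. The dictionary forms and the name implies are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

# The five forms, each written as a function of p, q, and r.
forms = {
    1: lambda p, q, r: implies(p, q),
    2: lambda p, q, r: implies(not q, not p),
    3: lambda p, q, r: implies(p and (not q), not p),
    4: lambda p, q, r: implies(p and (not q), q),
    5: lambda p, q, r: implies(p and (not q), r and (not r)),
}

rows = list(product([True, False], repeat=3))

# The forms are equivalent when, in every row, they all take one value.
all_equivalent = all(
    len({form(p, q, r) for form in forms.values()}) == 1
    for p, q, r in rows
)
print(all_equivalent)
```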


Example 4 If the President appointed a candidate (p), then the Senate
confirmed the appointment (c). Either the candidate was acceptable to the
Senate (a), or the Senate did not confirm the appointment. It is not the case
that the President appointed a candidate and the candidate was confirmed
by the Senate. Therefore, the President did not appoint a candidate.
Since the argument involves an implication and two disjunctions, it is
difficult to begin a direct proof. However, by introducing the negation of the
conclusion as another premise we arrive at the following indirect proof.

Premise(s) Used    Statement               Derived from

1                  (1) p => c              [Premise 1]
2                  (2) a ∨ ~c              [Premise 2]
3                  (3) ~(p ∧ c)            [Premise 3]
4                  (4) p (negation of      [Premise 4]
                       conclusion)
1, 4               (5) c                   [Statements (1) and (4)]
1, 2, 4            (6) a                   [Statements (2) and (5)]
3                  (7) ~p ∨ ~c             [Statement (3), De Morgan's Laws]
1, 2, 3, 4         (8) ~p                  [Statements (5) and (7)]
1, 2, 3, 4         (9) p ∧ ~p              [Statements (4) and (8)]
1, 2, 3            (10) ~p                 [Statements (4) and (9) using case
                                           (5) above]

Since introducing the negation of the conclusion results in a contradiction,
we conclude that the conclusion is indeed implied by the premises. ►

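The argument of Example 4 can also be checked by enumerating the eight combinations of truth values for p, c, and a. The Python sketch below is an illustration only. The name premises_hold is hypothetical, and the letters p, c, and a are used as in the example.

```python
from itertools import product

def implies(x, y):
    """Truth value of the conditional x => y."""
    return (not x) or y

def premises_hold(p, c, a):
    """Premises (1)-(3): p => c, a or not c, and not (p and c)."""
    return implies(p, c) and (a or (not c)) and not (p and c)

rows = list(product([True, False], repeat=3))

# Direct check: the conclusion ~p is true whenever the premises hold.
valid = all(not p for p, c, a in rows if premises_hold(p, c, a))

# Indirect check: adding p (the negation of the conclusion) as a fourth
# premise leaves no case in which all premises are true.
contradiction = not any(premises_hold(p, c, a) and p for p, c, a in rows)

print(valid, contradiction)
```

The second check mirrors the indirect proof: the enlarged set of premises cannot all be true at once.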

PROBLEMS

4. Draw a Venn diagram to show that in Example 4 the truth set of the
conjunction of premises (1)-(3) is a subset of P′, the truth set of the
conclusion ~p.



of the premises is written on the same line with the conclusion and the name
of the form.


TABLE 7 Useful forms of valid argument

                                  Premises                 Conclusion

Law of Detachment                 p ∧ (p => q)             q
Modus Tollendo Tollens            ~q ∧ (p => q)            ~p
Modus Tollendo Ponens             ~p ∧ (p ∨ q)             q
Law of Simplification             p ∧ q                    p
Law of Addition                   p                        p ∨ q
Law of Exportation                (p ∧ q) => r             p => (q => r)
Law of Importation                p => (q => r)            (p ∧ q) => r
Law of Hypothetical Syllogism     (p => q) ∧ (q => r)      p => r
Law of Absurdity                  p => (q ∧ ~q)            ~p
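Each form in Table 7 can be tested by checking that the conditional from its premises to its conclusion is true for every combination of truth values. The Python sketch below is an illustrative check, not a proof in the sense of the set-theoretic arguments of this section. The names forms and is_valid are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a => b."""
    return (not a) or b

# name -> (premises, conclusion), each written as a function of p, q, r.
forms = {
    "Law of Detachment": (
        lambda p, q, r: p and implies(p, q),
        lambda p, q, r: q),
    "Modus Tollendo Tollens": (
        lambda p, q, r: (not q) and implies(p, q),
        lambda p, q, r: not p),
    "Modus Tollendo Ponens": (
        lambda p, q, r: (not p) and (p or q),
        lambda p, q, r: q),
    "Law of Simplification": (
        lambda p, q, r: p and q,
        lambda p, q, r: p),
    "Law of Addition": (
        lambda p, q, r: p,
        lambda p, q, r: p or q),
    "Law of Exportation": (
        lambda p, q, r: implies(p and q, r),
        lambda p, q, r: implies(p, implies(q, r))),
    "Law of Importation": (
        lambda p, q, r: implies(p, implies(q, r)),
        lambda p, q, r: implies(p and q, r)),
    "Law of Hypothetical Syllogism": (
        lambda p, q, r: implies(p, q) and implies(q, r),
        lambda p, q, r: implies(p, r)),
    "Law of Absurdity": (
        lambda p, q, r: implies(p, q and (not q)),
        lambda p, q, r: not p),
}

def is_valid(premises, conclusion):
    """Valid when premises => conclusion is true for all eight assignments."""
    return all(
        implies(premises(*v), conclusion(*v))
        for v in product([True, False], repeat=3)
    )

results = {name: is_valid(prem, concl) for name, (prem, concl) in forms.items()}
for name, ok in results.items():
    print(name, ok)
```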


PROBLEMS 

2. Prove the validity of the Law of Exportation and give an example of its
use.

In Section 2.2 it was shown that the conditional p => q may always be
replaced by the equivalent contrapositive form ~q => ~p. Applied to
arguments, this means that showing at least one of the premises must be false
when the conclusion is false is equivalent to making the original argument.
The next examples illustrate this method of indirect argument.

Example 3 Suppose we wish to prove the theorem: If n is a positive
integer and n² is odd, then n is also odd. The equivalent contrapositive form
is: If n is not odd (is even), then n² is not odd (is even). Now if n is even,
then n = 2k where k is some integer. Hence n² = (2k)² = 4k² = 2(2k²)
is also even, being twice the integer 2k². This establishes the theorem. ►

PROBLEMS 

3. Try to find a direct proof of the theorem of Example 3.

Usually an indirect proof is attempted when it is easier than a direct proof
(as in Example 3) or when it is difficult to know just where to begin an
argument from the premises given. There are a variety of forms in which
one may cast an indirect proof. A little checking will show that the following
conditional forms are all equivalent and hence provide alternative forms for
argument:

(1) p => q

(2) ~q => ~p



ods are harmful to initiative, they are uneconomical. Merely mechanical
labor is boring. Therefore, merely mechanical labor is not socially
desirable.


6. Either a social institution satisfies some need, or it will not survive. A
social institution satisfies some need. Therefore, it will survive.


7. If an animal is not satiated, the animal will enter the goal box on the
subsequent trial. If the animal enters the goal box on the subsequent
trial, then it is not the case that the strength of the goal-seeking response
was reduced on the previous trial. The strength of the animal's
goal-seeking response on the previous trial is either reduced or increased. It
is known that the strength of the response was not increased on the
previous trial. Therefore, the animal is satiated.


8. If the fossil remains are human, then if the dating process is accurate
these remains are the oldest human remains yet discovered. Either the
fossil remains are human or the finding will not receive world-wide
acclaim. It is a fact that the dating process is accurate. Therefore, if the
finding is given world-wide acclaim, the remains are the oldest human
remains yet discovered.

9. If the Bilge Company is able to purchase raw material at a favorable
price or if the sales of the company increase, the company will not suffer
a loss. If there is a shortage of material, the Bilge Company [...]
able to purchase raw material at a favorable price. [...]
is no shortage of material. Therefore, the Bilge Company will not suffer
a loss.

10. If the candidate receives the support of the liberals [...]
of labor, then he will win. Either the candidate will not win or he will
introduce a bill favorable to labor. The candidate [...]
bill favorable to labor. Furthermore, the candidate [...] the
support of the liberals. Therefore, the candidate will not receive the support
of labor.

[Hint: Try an indirect proof.]


Example 6 Classical, or Aristotelian, logic is concerned with relations
among classes (sets) of objects. A typical example is

No Fascist is a Communist.
All Chinese are Communists.
Therefore, no Chinese are Fascists.

By introducing the sets F, C₁, and C₂ of Fascists, Communists, and Chinese,
respectively, these statements may be translated into set notation as

F ⊂ C₁′

C₂ ⊂ C₁



Example 5 Davis* has listed 56 formal propositions drawn from the
writings of sociologists and social psychologists. By slightly modifying some
of the propositions for the purpose of simplification, we can advance the
following general argument.

Person Pᵢ and Person Pⱼ are friends (PᵢFPⱼ) if and only if Pᵢ likes Pⱼ
(PᵢLPⱼ) and Pⱼ likes Pᵢ. If Pᵢ and Pⱼ are friends, then they are similar in
attitude (PᵢSPⱼ). If Pᵢ likes Pⱼ and Pⱼ dislikes Pₖ, then Pᵢ will dislike Pₖ.
If Pᵢ likes Pⱼ and Pⱼ likes Pₖ, then Pᵢ will like Pₖ. If Pᵢ dislikes Pⱼ and Pⱼ
likes Pₖ, then Pᵢ will dislike Pₖ. Suppose it is the case that P₁ and P₃ are
friends and P₂ likes P₃. Therefore, P₁ and P₂ are similar in attitude.

The validity of the above argument is verified in the following analysis.
It is assumed that Pᵢ either likes or dislikes Pⱼ, but not both.


Premise(s) Used     Statement                                  Derived from

1                   (1) PᵢFPⱼ <=> PᵢLPⱼ ∧ PⱼLPᵢ                [Premise 1]
2                   (2) PᵢFPⱼ => PᵢSPⱼ                         [Premise 2]
3                   (3) PᵢLPⱼ ∧ ~(PⱼLPₖ) => ~(PᵢLPₖ)           [Premise 3]
4                   (4) PᵢLPⱼ ∧ PⱼLPₖ => PᵢLPₖ                 [Premise 4]
5                   (5) ~(PᵢLPⱼ) ∧ PⱼLPₖ => ~(PᵢLPₖ)           [Premise 5]
6                   (6) P₂LP₃                                  [Premise 6]
7                   (7) P₁FP₃                                  [Premise 7]
1, 2                (8) PᵢLPⱼ ∧ PⱼLPᵢ => PᵢSPⱼ                 [Statements (1) and (2)]
3                   (9) P₁LP₃ => P₁LP₂ ∨ [...]                 [Statement (3)]
3, 6                (10) P₁LP₃ => P₁LP₂                        [Statements (6) and (9)]
3, 6, 7             (11) P₁LP₂                                 [Statements (7) and (10)]
1                   (12) P₁FP₃ => P₃LP₁                        [Statement (1)]
1, 7                (13) P₃LP₁                                 [Statements (7) and (12)]
4                   (14) P₂LP₃ ∧ P₃LP₁ => P₂LP₁                [Statement (4)]
1, 4, 6, 7          (15) P₂LP₁                                 [Statements (6), (13), and (14)]
1, 2, 3, 4, 6, 7    (16) P₁SP₂                                 [Statements (8), (11), and (15)]


In contrast to Example 4, where there were not enough premises for a direct
proof and where the negation of the conclusion was offered as another
premise, the number of premises … one, since the conclusion may be obtained …

PROBLEMS 

In Problems 5-10 determine whether the arguments are valid 

' 11 work methods are ^onng^they ar e t ha^ b y| y t are « ™Hr desirable 
*Davis, J. A., “Structural Balance, Mechanical Solidarity, and Interpersonal
Relations,” American Journal of Sociology 68 (1963).




17. A set of premises is said to be an inconsistent set if it is impossible that
all be true at the same time, that is, if the truth set of their conjunction
is empty.

(a) Which of the following sets of premises are inconsistent?

(l )~(pV~9) (n )/>=»<? 

pV~r poet r 

q=*r (iV J>«~7 

(b) Prove that from an inconsistent set of premises one may validly con-
clude any statement whatsoever.

18. Prove the validity of the arguments listed in Table 7.


(m) p => q 
q=> r 

P A s 


SUPPLEMENTARY READING

Kemeny, J. G., Snell, J. L., and Thompson, G. L., Introduction to Finite
Mathematics (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1966), 2nd ed., Chapter
Luchins, A. S., and Luchins, E. H., Logical Foundations of Mathematics for Be-
havioral Scientists (Holt, Rinehart and Winston, Inc., New York, 1965).

Langer, S. K., An Introduction to Symbolic Logic (Dover Publications, Inc., New
York, 1953), 2nd ed.

Suppes, P., Introduction to Logic (D. Van Nostrand Company, Inc., Princeton,
N.J., 1957), Chapters 1 and 2.



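Therefore,

$C_h \subseteq F'$

Since $A \subseteq B'$ and $B \subseteq A'$ are equivalent statements about sets, we
may take the premises to be equivalent to $C_h \subseteq C_m$ and $C_m \subseteq F'$. To-
gether these imply the conclusion $C_h \subseteq F'$ and the argument is valid. ►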


PROBLEMS

Use the method of Example 6 to determine whether the arguments in
Problems 11-14 are valid.


11. In some forms of government, the individual is subservient to the state.
All governmental systems which make the individual subservient to the
state are nondemocratic. Therefore, some forms of government are
nondemocratic.

12. Television programs are designed for maximum audience appeal. Nothing
designed for maximum appeal can be of high quality. Therefore, no tele-
vision program is of high quality.

13. Some Democrats are liberals. No liberals are conservative. Hence, no
Democrats are conservative.

14. All subjects are male or overachievers or anxious. All males are students
or anxious. All anxious males are students and overachievers. Some
overachieving males are students. Therefore, some subjects are students.

15. A young man was imprisoned in a cell having two doors, one of which
led to freedom while the other led to certain death. The jailor, being a
sporting man, agreed to give the prisoner his choice of doors. The cell
was guarded by two guards, one of whom, the prisoner was told, could
tell no lies, and the other could tell no truth. The prisoner was permitted
to ask one question which each guard would answer with a simple “yes”
or “no.” Our prisoner, having been a student at Professor Foole’s logic
school, asked a question and then walked out, a free man. How did he
know which door to choose?

16. A personnel manager is seeking an individual who is adept in problem
solving. From a number of applicants, he selects three promising candi-
dates and uses the following test to eliminate all but one. The three
individuals are blindfolded and seated in a triangle. They are instructed
that a black or white mark is to be placed on each forehead. When the
blindfold is removed, they are told to look at each of the others. If they
see a black mark, they are to raise their hand until they can reach a
decision as to the color of the mark on their forehead. Then they are
to stand and give the reason for the decision. Unknown to the applicants,
the personnel manager places a black mark on each forehead. After a
period of time, one applicant rises, and indicates that the mark on his
forehead is black. How did he arrive at that decision?





FIGURE 1 



FIGURE 2 



In general, ordered pairs are considered different unless they have the
same entries, in the same order. That is, the pairs (x, y) and (r, s) are equal
if and only if x = r and y = s. This property distinguishes ordered pairs
from sets having two elements. For, while the sets {x, y} and {y, x} are
identical, since they have the same members, the ordered pairs (x, y) and
(y, x) are not equal unless x = y.

The objects in an ordered pair need not be numbers. For instance, the
list of grades for a class is a set of ordered pairs in which the first element is a
student’s name, and the second element is his grade. Obviously, we want to

3 RELATIONS


3.1 BINARY RELATIONS

In ordinary discourse we think of relations as holding between two things,
or among several things. We speak of John Adams as being in the relation of
father to John Quincy Adams. We express the relative standings of the
National League baseball teams by indicating the number of games each
team is behind the leader.

Relations which hold between pairs of objects are called binary relations,
those involving three objects are ternary relations and, in general, an “n-ary”
relation connects n objects. As well as being the simplest relations, binary
relations are the type most used in the behavioral sciences. We shall therefore
devote most of our discussion to these relations.

From your everyday experiences you are familiar with various kinds of
relations connecting two people. Examples are family relations such as
mother, father, uncle, sister, and cousin, and emotional relations such as
loving and hating. Relations involving pairs of numbers have appeared in
your study of mathematics. To say that a real number x has the relation
greater than to a real number y is to say that the pair (x, y) lies in the shaded
region of Figure 1. Similarly, x is equal to y if and only if the pair (x, y) lies
on the line passing through the origin and inclined 45° from the horizontal.

A word about pairs is perhaps in order here. A pair of objects in which we
distinguish one of the objects as the first and the other (which need not be
different) as the second is called an ordered pair. The most familiar use of
ordered pairs is the representation, as in Figure 2, of points on a plane by a
pair of numbers, the first of which denotes the horizontal distance and the
second the vertical distance from a fixed point O, called the origin. In this
interpretation it is apparent that the pairs (1, 5) and (5, 1) represent different
points and should be considered to be different pairs of numbers.





Example 2. The political scientist interested in voting behavior relative to
a specific elective office might define the relation “votes for” as a set of all
ordered pairs (x, y) such that y is the candidate chosen by voter x.

If he were interested in a slate of four offices, the quinary relation “votes
for candidates,” consisting of the set of ordered 5-tuples $(x, y_1, y_2, y_3, y_4)$
such that $y_1$, $y_2$, $y_3$, and $y_4$ are the respective candidates chosen by x for
the four offices, would be of greatest value to his work. ►


PROBLEMS 


1. Which of the following are relations?

(a) {(x, y): x and y are real numbers}

(b) (Red card, 10 seconds)

(c) {(x, y): x and y are real numbers and y = 5}

(d) {(J. Oates, 27 years)}

(e) {10 years seniority, 120 I.Q., machinist}

(f) {(5, IBM, Standard Oil), (120, Du Pont, GMC)}


2. Each of the following is intended to describe a binary relation connecting
elements of the set U = {1, 2, 3, 4}. Write each relation by listing its
elements.


(a) “is equal to”

(b) “is less than”

(c) “is not equal to”

(d) “is a multiple of”

(e) “differs by at least two from”

(f) “is equal to or greater than”

3. Let the universal set U be the collection of all subsets of the set P
List the elements of each of the following relations.

(a) “is a subset of”

(b) “is equal to”

Let R be the relation “is one less than” defined on the
List the elements of R.


Whenever we have two sets we can form ordered pairs by choosing the
first element in a pair from the first set and the second element from the second
set. The collection of all pairs which can be formed in this way is called the
Cartesian product* of the two sets. The Cartesian product of sets A and B (in
that order) is denoted $A \times B$.

*Named for Descartes, a seventeenth-century French philosopher and
mathematician.



distinguish the pair (Jones, A), Jones received an A, from the
pair (A, Jones), A received a grade of Jones.

From a mathematical point of view it is useful to regard a relation as a set of
ordered pairs. Thus the relation “greater than” is the set G of all
pairs (x, y) of real numbers for which x > y. Then x > y
if and only if $(x, y) \in G$. In set notation,

$G = \{(x, y) : x \text{ and } y \text{ are real numbers and } x > y\}$

71,6 rda “°" ° r 

a dominates b, the relation of "father" as fhe s , r , Pa ‘ ,S ( °' b) SUch ,hat 

In addition to the description of a relation there must be a previously
agreed upon universal set U to which this description refers. If, for example,
U is the set of all humans, the relation “is the brother of” contains the pair
(Cesare, Lucrezia), but this pair is not included if U is the set of all males.
In general, a relation defined relative to a set U is a set of ordered pairs of
elements of U.

Example 1. If U is the set {0, 1, 2}, then the relation “is less than or equal
to” defined on U is the set {(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)}. ►

that a similar situation „ waning m a particular m deScnbe ,n general, 
Sion there was to allow an^t !" ° Ur dlscuss '°n of symbol^'l Y ° U Wdl recaI1 
implication, and so forth J t ° staten i=nts to be comh 'a ° 8 ‘ c 0ur deci- 
hav = made the same d«K, ^ ° r they w e ”se ’ by co "Jm«.o„, 
ordered pairs, regardless o h ' n a rdated 

77? Thus ’ rw rr ,h = «-*« zc 

wrar and Peace, 6j), m , . . , 

3lth ™ 8b a "f intuitive s7e!t! ^ ^ ^ 
Relations involving more than v ^ ^ ‘ S 1,01 rcadl, y 



A relation R is said to be reflexive in (relative to) a set U if it is true for
every $x \in U$ that xRx, that is, that the pair (x, x) is an element of R, or that
each element $x \in U$ bears the relation R to itself. The relation “is at least as
tall as” is reflexive in the set of all people, every person being at least as tall
as himself. The relation

G = {(1, 1), (liberty, liberty), (1, 2), (3, 3)}

is not reflexive in the set W = {1, 2, 3} since (2, 2) is not a member of G, but
is reflexive in the set V = {liberty, 3}.

A relation R is symmetric in a set U if for all x and y in U, it is true that
yRx whenever xRy. If U consists of a set of five people lined up for a picture,
then the relation “stands next to” is symmetric since if A stands next to B,
then B also stands next to A. The relation “is the brother of” is not symmet-
ric in the set of all humans, but is symmetric in the set of all males. Sadly, the
relation “loves” is not symmetric in the set of all people. The relation G,
above, is not symmetric in W since (2, 1) is not a member of G. However, G
is symmetric in V since neither (liberty, 3) nor (3, liberty) belongs to G.

A relation R is transitive in a set U if for all x, y, and z in U, it is true that
whenever xRy and yRz, then xRz. The relation < is transitive in the set of
real numbers since x < y and y < z together imply x < z. The relation
“stands next to” is not transitive in a line of more than two people while the
relation “is a sibling (brother or sister) of” is transitive in the set of all
people. The relation G above is transitive in both W and V.
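
The three definitions can be checked mechanically on a finite relation. The following is a minimal sketch in Python (not part of the original text); the function names are illustrative, and it assumes that a relation is taken "in U" by keeping only the pairs whose entries both lie in U.

```python
def restrict(relation, universe):
    """Keep only the pairs whose entries both lie in the universe."""
    return {(x, y) for (x, y) in relation if x in universe and y in universe}

def is_reflexive(relation, universe):
    # every x in U must be paired with itself
    return all((x, x) in relation for x in universe)

def is_symmetric(relation, universe):
    r = restrict(relation, universe)
    return all((y, x) in r for (x, y) in r)

def is_transitive(relation, universe):
    r = restrict(relation, universe)
    # whenever xRy and yRz, the pair (x, z) must also be present
    return all((x, z) in r for (x, y) in r for (w, z) in r if y == w)

# The relation G and the sets W and V from the text.
G = {(1, 1), ("liberty", "liberty"), (1, 2), (3, 3)}
W = {1, 2, 3}
V = {"liberty", 3}

for name, universe in (("W", W), ("V", V)):
    print(name,
          is_reflexive(G, universe),
          is_symmetric(G, universe),
          is_transitive(G, universe))
```

Run on G, the checks agree with the classification given above: G is transitive but neither reflexive nor symmetric in W, and has all three properties in V.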

PROBLEMS 

1. Classify the relations in Problem 2, Section 3.1, as to whether they are
reflexive, symmetric, and transitive.

A relation which is reflexive, symmetric, and transitive is called an equiva-
lence relation. The most familiar example is the relation “equality” between
real numbers. Other equivalence relations are “is the same height as” and
“has the same color hair as.” The relation “is a sibling of” is not an equiva-
lence relation since it is not reflexive. The relation < fails because it is not
symmetric.

Example 1. Bavelas* in his work on small-group communication intro-
duces a relation “touching” about which the following assumptions are
made.

(a) The group consists of a collection of subgroups called cells.
We shall take the universal set U to be the set of all cells.

(b) A given cell may or may not be touching another cell.

(c) If cell $C_1$ is touching $C_2$, then $C_2$ is touching $C_1$.

(d) No cell touches itself.

*Bavelas, A., “A Mathematical Model for Group Structures,” Applied Anthropology 7,
16-30 (1948).





Example 3. If A = {a, b} and B = {1, 2, 3}, then

$A \times B = \{(a, 1), (a, 2), (a, 3), (b, 1), (b, 2), (b, 3)\}$

$B \times A = \{(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)\}$

$A \times A = \{(a, a), (a, b), (b, a), (b, b)\}$

$B \times B = \{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)\}$

Note that $A \times B$ is not equal to $B \times A$. ►
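
The products in Example 3 can be reproduced with a short Python sketch (an illustration added here, not part of the original text). It uses the standard-library function `itertools.product`; the variable names are arbitrary.

```python
from itertools import product

A = ["a", "b"]
B = [1, 2, 3]

# Each Cartesian product is the set of all ordered pairs (first, second).
A_x_B = set(product(A, B))
B_x_A = set(product(B, A))
A_x_A = set(product(A, A))
B_x_B = set(product(B, B))

print(sorted(A_x_B))      # the six pairs (a, 1), ..., (b, 3)
print(A_x_B == B_x_A)     # order within a pair matters
print(len(A_x_A), len(B_x_B))
```

The number of pairs in each product is the product of the sizes of the two sets, which is a quick check on a hand-written list.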


PROBLEMS

5. Let A = …, B = {subject A, subject B}, and C = {yes}. Write the
following sets.

(a) $A \times A$   (b) $C \times C$   (c) $A \times B$   (d) $B \times A$

(e) $A \times (B \cup C)$   (f) $(A \times B) \cup (A \times C)$   (g) $A \times B \times C$

6. Argue that a binary relation defined relative to a set U is a subset of the
Cartesian product $U \times U$.

7. Let U denote the universal set.

(a) Show that $X = U \times U$ is a relation, called the universal relation.

(b) Show that $\emptyset$ can be regarded as a relation, called the empty relation.

8. Given A = {1, 2} and B = {2, 4}, write the following relations in
(subsets of) $A \times B$.

(a) “is less than”

(b) “is not equal to”

(c) “is at least as great as”

(d) “is half as large as”

9. In Problem 8, how many distinct relations can be defined in $A \times B$?

10. Prove the distributive laws.

(a) $A \times (B \cup C) = (A \times B) \cup (A \times C)$

(b) $A \times (B \cap C) = (A \times B) \cap (A \times C)$

11. Let A and B be subsets of some universal set U. Prove that
$(A \times U) \cap (U \times B) = A \times B$.

12. The concept of ordered pair may be defined in terms of sets by writing
(a, b) = {{a}, {a, b}}. Prove that with this definition (a, b) = (r, s) if
and only if a = r and b = s.


3.2 EQUIVALENCE RELATIONS 

It is convenient to introduce letters to designate relations. For instance, we
indicate that c stands in the relation of friend to b by writing cFb. Sim-
ilarly, xGy might be used to indicate that x is greater than y.





Equivalence classes, then, either have no elements in common or are
identical. Every element of U lies in at least one equivalence class (namely,
its own) but no element is a member of two different classes. The distinct
equivalence classes determined by E thus partition U in the sense of the fol-
lowing definition.

Definition 1. A collection of sets $A_i$ constitutes a partition of a set U if

(1) $\bigcup_i A_i = U$; every element in U is a member of at least one $A_i$.

(2) $A_i \cap A_j = \emptyset$ whenever $i \ne j$; no element of U lies in more than one
$A_i$. ►

PROBLEMS 

3. Write two partitions of the set U = {a, b, c, d, e}.

4. Let U be the set of all U.S. Presidents and B be the relation “was born
the same month as.” List the equivalence classes of B in U.

We have seen that each equivalence relation E defined on a set U classifies
the elements of U into various disjoint subsets forming a partition of U. Con-
versely, any partition of U determines an equivalence relation E in U. We
simply take xEy if and only if x and y belong to the same set in the partition.

Classifications utilizing equivalence relations are widely used in the be-
havioral sciences. A sociologist may classify human beings according to the
region of the country in which they reside and may then compare regions
according to various attributes such as average income, amount of education,
etc. An economist may classify firms as distributing, manufacturing, and so
forth.
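
The passage from a partition to an equivalence relation can be made concrete with a small sketch in Python (added for illustration; the partition chosen below is a hypothetical one of the set {a, b, c, d, e}, and the function name is not from the text).

```python
def relation_from_partition(blocks):
    """Build E by taking xEy exactly when x and y lie in the same block."""
    return {(x, y) for block in blocks for x in block for y in block}

# A hypothetical partition of U = {a, b, c, d, e} into three blocks.
partition = [{"a", "b"}, {"c"}, {"d", "e"}]
E = relation_from_partition(partition)

print(("a", "b") in E)   # same block
print(("a", "c") in E)   # different blocks
print(len(E))            # 2*2 + 1*1 + 2*2 ordered pairs
```

Because every element is in the same block as itself, E is reflexive; symmetry and transitivity follow in the same way from the blocks being disjoint.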

Such classification is not always successful. In grouping humans into
generations a natural rule might be xGy if x and y were born within 25 years
of each other. However, this relation is not transitive, and hence is not an
equivalence relation. For instance, if x was born in 1920, y in 1935, and z in
1955, we have xGy and yGz but not xGz. We must look for another criterion
for classification.
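
The failure of transitivity can be verified directly. The sketch below (Python, added for illustration) assumes that "within 25 years" means a difference of at most 25 years; the function name and the `span` parameter are hypothetical.

```python
def same_generation(birth_x, birth_y, span=25):
    """xGy when the two birth years differ by at most `span` years."""
    return abs(birth_x - birth_y) <= span

x, y, z = 1920, 1935, 1955   # the birth years used in the text

print(same_generation(x, y))   # 15 years apart
print(same_generation(y, z))   # 20 years apart
print(same_generation(x, z))   # 35 years apart
```

The first two comparisons hold while the third does not, so G is reflexive and symmetric but not transitive.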

Example 2. Economists are concerned with consumer demand for com-
modities. As a means of explaining consumer demand, a preference relation
P is defined on the set U of commodities by xPy if commodity x is preferred
to commodity y. It is assumed that

(a) P is not reflexive. No commodity is preferred to itself.

(b) P is not symmetric. If x is preferred to y, then y is not preferred to x.

(c) P is transitive. If x is preferred to y and y to z, then x is preferred to z.

Thus P is not an equivalence relation. ►





Property (c) states that “touching” (T) is symmetric while (d) implies that
T is not reflexive. Nor is T necessarily transitive, since it is not
required that we must have $C_1TC_3$ whenever $C_1TC_2$ and $C_2TC_3$.
“Touching” is not an equivalence relation.

Suppose, however, that we define another relation, linked, by

Cells A and B are linked if either A = B or there exist cells $C_1, C_2, \ldots,
C_n$ such that A touches $C_1$, $C_1$ touches $C_2$, …, $C_{n-1}$ touches $C_n$, and $C_n$
touches B.


The symmetry of “touches” guarantees that if A is linked to B, then B is
linked to A. Further, if A is linked to B and B to C, then A must be linked
to C. The relation “linked” is therefore an equivalence relation, being re-
flexive, symmetric, and transitive. ►
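
The relation “linked” is what one obtains by following chains of touching cells. A minimal Python sketch of this idea follows (added for illustration; the five cells and their touching pattern are hypothetical, and the breadth-first search is simply one convenient way to follow the chains).

```python
from collections import deque

def linked_class(cell, touching):
    """Return all cells linked to `cell`: the cell itself plus every cell
    reachable through a chain of touching cells."""
    seen = {cell}
    queue = deque([cell])
    while queue:
        current = queue.popleft()
        for neighbour in touching.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Hypothetical touching pattern: symmetric, and no cell touches itself.
touching = {
    "C1": {"C2"},
    "C2": {"C1", "C3"},
    "C3": {"C2"},
    "C4": {"C5"},
    "C5": {"C4"},
}

print(sorted(linked_class("C1", touching)))
print(sorted(linked_class("C4", touching)))
```

In this example C1 does not touch C3, yet the two are linked through C2, so the cells fall into two classes of mutually linked cells.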


PROBLEMS 

2. Give details of the proofs of the statements in the last paragraph of
Example 1.


Suppose that E is an equivalence relation defined on a set U and let x be
an element of U. The set of all those elements of U which have the relation
E with x is called the equivalence class of x. That is, the equivalence class of
x is the set

$E_x = \{y \in U : yEx\}$


Since E is reflexive we must have xEx, and thus each element of U is a
member of its own equivalence class. Moreover, the symmetry of E guaran-
tees that if yEx, then xEy. Hence if y is a member of the equivalence class
$E_x$, then x is in turn a member of $E_y$. In fact, if yEx, the equivalence classes
$E_x$ and $E_y$ must be identical. Let us suppose z is an element of $E_y$. Then
zEy, and this together with yEx and the transitivity of E gives zEx or $z \in E_x$.
It follows that $E_y$ is a subset of $E_x$. A similar argument shows $E_x \subseteq E_y$ and
hence that $E_y = E_x$. We conclude that if xEy, then $E_x$ and $E_y$ are the same
equivalence class. Actually, the equality of $E_y$ and $E_x$ follows if they have
any element in common. For suppose $z \in E_x \cap E_y$ (see Figure 3). Then
zEx and zEy. By symmetry zEx becomes xEz, which together with zEy gives
xEy. This, as we have seen, implies $E_y = E_x$.
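
The definition of $E_x$ translates directly into a computation on a finite set. The following Python sketch is an added illustration; the universe {1, ..., 7} and the relation "leaves the same remainder on division by 3" are hypothetical choices, as are the function names.

```python
def equivalence_class(x, universe, related):
    """E_x = {y in U : y E x}, returned as a frozenset."""
    return frozenset(y for y in universe if related(y, x))

def equivalence_classes(universe, related):
    """Collect the distinct classes; identical classes collapse in the set."""
    return {equivalence_class(x, universe, related) for x in universe}

U = range(1, 8)

def same_remainder(a, b):
    # hypothetical equivalence relation on U
    return a % 3 == b % 3

classes = equivalence_classes(U, same_remainder)
for cls in sorted(classes, key=min):
    print(sorted(cls))
```

Seven elements produce only three distinct classes, illustrating that two classes with a common element coincide.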







14. For the relations A, B, and U, prove that if $A \subseteq B$ then $UA \subseteq UB$ for
all U.

15. Idempotency of a relation is defined as AA = A. Prove that if A is an
equivalence relation, then it must be an idempotent relation.


3.3 WEAK ORDERING RELATIONS AND
PREFERENCE RELATIONS

An equivalence relation provides the mathematical counterpart to the opera-
tion of classifying a group of objects. Equally important in the behavioral
sciences is the problem of ordering a collection of objects.

Mathematically, orderings arise by omitting one or more of the three prop-
erties possessed by equivalence relations. For example, the relation $\le$ is
reflexive and transitive, but not symmetric. While it is possible that we have
both $x \le y$ and $y \le x$, it is not in general true that $x \le y$ implies $y \le x$.
The relation < is transitive but not symmetric or reflexive. However, in this
case more can be said since x < x is always false and x < y implies that
y < x is false.

A relation R for which xRx is always false is said to be irreflexive. If R
is such that xRy always implies that yRx is false, then R is called asymmetric.
Thus the relation < is transitive, asymmetric, and irreflexive. The relation $\le$
is reflexive and transitive, but neither symmetric nor asymmetric.

A relation R which is reflexive and transitive on a set U is called a weak
ordering relation on U. A relation R which is transitive and asymmetric on
U is called a preference relation on U. Thus the relation $\le$ is a weak ordering,
while < is a preference relation. In Example 2 of Section 3.2, P is a preference
relation in this technical sense.

Example 1. Let U be the set of all families in a community and R be the
relation “is socially more prominent than.” Then R is

(1) irreflexive: no family is socially more prominent than itself.

(2) asymmetric: if family F is more prominent than family G, then G is
not more prominent than F.

(3) transitive: if family F is more prominent than family G, and G is more
prominent than H, then F is more prominent than H.

Hence R is a preference relation. ►

PROBLEMS 

1. Let A denote the relation “is at least as socially prominent as.” Prove
that A is a weak ordering.

2. Prove that a relation which is asymmetric must also be irreflexive. In
particular, a preference relation is irreflexive.





PROBLEMS 

5. The indifference relation I is defined by xIy if and only if x is not pre-
ferred to y and y is not preferred to x. That is,

$xIy \Leftrightarrow \sim xPy \land \sim yPx$

Show that I is an equivalence relation. The relation I is called the equiva-
lence relation determined by the preference relation P. What are the
equivalence classes for I?

6. Determine which of the following are equivalence relations. For those
which are not, state which properties (R, S, or T) fail to hold.

(a) “votes for”   (b) “costs more than”

(c) “is equivalent to”   (d) “is at war with”

(e) “is biologically related to”   (f) “causes”

(g) “communicates with”   (h) “has at least as much utility as”

(i) “influences”


7. A subject is presented with a set U of 64 color patches differing in bright-
ness. He is requested to sort the patches according to the magnitude of
the brightness differential between the patches and the standard. Show
that the relation “differs in brightness from the standard by the same
amount as” induces a partition of U and is therefore an equivalence
relation.


8. Let $R_1$ be the equivalence relation “belongs to the same social group as”
and $R_2$ the equivalence relation “belongs to the same religious group
as.” Let xRy mean $xR_1y \land xR_2y$. Prove that R is an equivalence
relation.


9. We know that any equivalence relation partitions the universal set into
equivalence classes. In Problem 8, how are the equivalence classes for
R related to those for $R_1$ and $R_2$?

10. Let $R_1$ and $R_2$ be as defined in Problem 8 and let xQy be defined as
$xR_1y \lor xR_2y$. Show that Q need not be an equivalence relation.

11. The product of two relations A and B is defined by AB = {(x, y): there
exists z such that xAz and zBy}. Show that if A is an equivalence rela-
tion, then $AA \subseteq A$.

12. Suppose that an experimenter is studying the communication pattern of
a task-oriented group. The communication network is so organized that
for any two members, neither can contact the other directly. All com-
munication between members is directed to the experimenter, who func-
tions as a relay. If A is the relation “can send a message to” and B is the
relation “can receive a message from,” prove that AB is an equivalence
relation.

13 o" AB ,tUS ' 1aniVSB where v • V - -4. and B are relations, then 





That is,

$x = y \Leftrightarrow x \le y \text{ and } y \le x$

Theorem 2. If W is a weak ordering relation and E is a relation defined by

$xEy \Leftrightarrow xWy \land yWx$   (2)

then E is an equivalence relation.

PROOF. We must show that E is reflexive, symmetric, and transitive. By (2),
xEx if and only if xWx. But xWx for all x since W is reflexive. Hence the
same is true of E.

The symmetry of E is immediate since xEy and yEx have the same defini-
tion. To show that E is transitive we must argue that $xEy \land yEz$ implies
xEz. By definition, $xEy \land yEz$ is equivalent to

$xWy \land yWx \land yWz \land zWy$

Because W is transitive, the first and third conjunctive forms yield xWz,
while the fourth and second give zWx. Hence E is transitive. ►
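
Theorem 2 can be illustrated on a small finite example. In the Python sketch below (added for illustration), the weak ordering W is a hypothetical one: four people with made-up scores, with xWy taken to mean that x's score does not exceed y's. E is then built from W exactly as in (2) and its three properties are checked.

```python
# Hypothetical data: xWy means "x's score is no greater than y's score".
scores = {"ann": 3, "bob": 5, "cy": 3, "dee": 7}
U = set(scores)

W = {(x, y) for x in U for y in U if scores[x] <= scores[y]}

# Definition (2): xEy if and only if xWy and yWx.
E = {(x, y) for (x, y) in W if (y, x) in W}

reflexive = all((x, x) in E for x in U)
symmetric = all((y, x) in E for (x, y) in E)
transitive = all((x, z) in E for (x, y) in E for (w, z) in E if y == w)

print(reflexive, symmetric, transitive)
print(sorted(E))
```

Here E relates exactly those people with equal scores, so ann and cy fall in one equivalence class while bob and dee each stand alone.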

Example 3. Continuing Example 2, let us define the indifference relation I
by

xIy if neither xPy nor yPx

It follows from the definitions of W and P that

$xIy \Leftrightarrow xWy \text{ and } yWx$

By Theorem 2, then, I is an equivalence relation, a fact previously verified
directly in Problem 5 of Section 3.2.

The ordinal assumption of utility theory requires that $u(x) \ge u(y)$ when-
ever xWy. Thus xIy implies u(x) = u(y). Equality of utility values reflects
indifference between alternatives. ►

PROBLEMS 

3. For each of the following relations, state whether it is (i) reflexive, (ii) ir-
reflexive, (iii) symmetric, (iv) asymmetric, or (v) transitive.

(a) “negotiates with”

(b) “is superior to”

(c) “is the brother of” in the set of all humans

(d) “is the brother of” in the set of all males

(e) “is the brother of” in the set of all females

(f) “is as bright as”

(g) “merges with”

(h) “differs from”

(i) “is subsumed by”





From any weak ordering relation it is possible to obtain a corresponding
preference relation. For instance, from the relation $\le$ we may obtain the
relation < by writing x < y if and only if $x \le y$ and not $y \le x$.

Theorem 1. If W is a weak ordering relation and P is a relation defined by

$xPy \Leftrightarrow (xWy) \land \sim(yWx)$   (1)

then P is a preference relation.

PROOF. We must show that (1) means P is asymmetric and transitive. First,
xPy means xWy and not yWx, while yPx means yWx and not xWy. The two
statements are obviously contradictory, and it follows that P is asymmetric.
Transitivity of P follows from transitivity of W. For, if xPy and yPz, then,
among other things, we have xWy and yWz. Hence xWz. But if we also had
zWx then this, together with xWy, would imply zWy, contradicting yPz.
We have proved xPz and thus P is transitive. ►

The relation P defined in (1) is called the preference relation determined
by W.
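
The construction in (1) can be tried on a finite weak ordering. The Python sketch below is an added illustration; the scores are hypothetical and xWy is taken to mean that x's score does not exceed y's. P is formed as in (1), and asymmetry and transitivity are then checked by enumeration.

```python
# Hypothetical data: xWy means "x's score is no greater than y's score".
scores = {"ann": 3, "bob": 5, "cy": 3, "dee": 7}
U = set(scores)

W = {(x, y) for x in U for y in U if scores[x] <= scores[y]}

# Definition (1): xPy if and only if xWy and not yWx.
P = {(x, y) for (x, y) in W if (y, x) not in W}

asymmetric = all((y, x) not in P for (x, y) in P)
transitive = all((x, z) in P for (x, y) in P for (w, z) in P if y == w)

print(asymmetric, transitive)
print(sorted(P))
```

With these scores P contains exactly the pairs in which the first score is strictly smaller than the second, mirroring the way < is obtained from $\le$.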


Example 2. Behavioral scientists are becoming increasingly interested in
the process by which an individual, when confronted with the necessity of
making a decision, chooses one course of action rather than another from a
set of possible alternatives. The alternatives may range from various bundles
of commodities for the housewife to alternative strategies of international
politics for national leaders.

The notions of preference relation and indifference relation (Example 2
and Problem 5 of Section 3.2) apply here. Thus the relation W defined by
xWy if an individual either prefers x to y or is indifferent between the two is
a weak ordering relation. The relation P defined by xPy if x is preferred to y
is the preference relation determined by W.

One way which is often used to formalize problems involving preference
or choice is to assign to each alternative x a number u(x), called the utility
of x. A basic assumption of utility theory, the ordinal assumption, states that
if an individual prefers x to y or is indifferent between them (that is, if xWy)
then $u(x) \ge u(y)$. Based on the ordinal assumption and certain other as-
sumptions,* the notion of utility serves as a numerical characterization of
preference and indifference. ►


Equivalence relations as well as preference relations may be obtained
from weak ordering relations. For example, if W is the relation $\le$, the rela-
tion E of equality is obtained by writing

$xEy \Leftrightarrow xWy \land yWx$


*See, for example, the discussion in Chapter 2 of Luce, R. D., and Raiffa, H.,
Games and Decisions (John Wiley & Sons, Inc., New York, 1957).





3.4 PARTIAL AND SIMPLE ORDERINGS


As we have seen, an equivalence relation E partitions the universal set U
into disjoint subsets called equivalence classes. The elements of a particular
equivalence class are alike relative to E. If, for instance, U is a set of people
and E is the relation “has the same aptitude as,” an equivalence class contains
all people having the same aptitude. If aptitude is the only criterion upon
which a person is to be selected, then members of the same equivalence class
may be freely substituted for one another.

An equivalence relation E is said to be consistent with another relation S
if elements in the same equivalence class (relative to E) may be substituted for
each other in statements involving S. A familiar example is the relation E
of equality among numbers. The statement that “equals may be substituted
for equals” is a statement concerning the consistency of E with any other
relation involving numbers.

Definition 2. An equivalence relation E and another relation S are said to
be consistent if

$(xEy) \land (ySz) \Rightarrow xSz$

and

$(xEy) \land (zSy) \Rightarrow zSx$

for all x, y, and z in U. ►

Example 1. Measurement models in the behavioral sciences generally in-
clude transitivity as one of the postulates necessary for the construction of a
measurement scale. Intuitively speaking, if a is in some sense “greater than”
b and b is in turn “greater than” c, then a should be “greater than” c.
Unfortunately, measurement relations which are defined by experimental
procedures do not always exhibit the desired transitivity. For instance, sup-
pose that we wish to scale a set of tones with respect to loudness. It is possible,
of course, to measure mechanically (in decibels) the loudness of each tone and
to arrange them according to the relations L, where aLb means a is louder than b,
and E, where aEb means a and b are equally loud. Assuming no measurement error,
it is apparent that L is irreflexive and transitive, and hence is a preference
relation, while E is an equivalence relation.

More interesting to the behavioral scientist than the mechanical measure-
ment of loudness is the set of responses given by a subject who is presented
with pairs of tones and asked in each case to choose the louder one. If tones
x, y, and z are of almost equal loudness but differ in pitch, it is quite con-
ceivable that the responses xLy, yLz, and zLx might be obtained, yielding an
obvious violation of transitivity.

One way of partially overcoming this difficulty is to present each pair of
tones a large number of times. Then aLb whenever a is judged louder than b
more times than b is judged louder than a, and aEb if each is judged to be
louder in exactly half of the presentations.



(j) “is no worse than”

(k) “is the same as”

4. From the relations in Problem 3, select the weak ordering, the preference,
and the equivalence relations.

5. Find the preference relation determined by the following weak ordering
relations.

(a) “is no wealthier than”

(b) “is at least as liberal as”

(c) “is no more popular than”

6. Let V be a domain of discourse. Prove that

(a) the relation I of logical implication is a weak ordering relation.

(b) the relation E of logical equivalence is an equivalence relation.

(c) the relation E is the equivalence relation determined by the weak
ordering relation I.

7. Let U be a universal set. Prove that

(a) the relation S of subset is a weak ordering.

(b) the relation E of set equality is an equivalence relation.

(c) E is the equivalence relation determined by the weak ordering S.

(d) What is the preference relation determined by S?

8. For each of the following relations R, find the weak ordering relation
that determines R.

(a) xRy if schools x and y have the same racial balance

(b) xRy if cities x and y have equal crime rates

(c) xRy if decision x is equally as risky as decision y

9. Prove that every equivalence relation is also a weak ordering relation.

10. Prove that a relation which is irreflexive and transitive is a preference
relation.

Problems 11 and 12 refer to Examples 2 and 3.

11. Prove that for each pair x, y of elements of U, exactly one of the state-
ments xIy, xPy, or yPx is true.

12. Prove that for all $x, y, z \in U$, if xPy and yIz, then xPz, and if xPy and
xIz, then zPy.

13. Complete the following table concerning the R, S, and T properties of
the set relations membership, inclusion (subset), and equality.

              R      S      T
Membership    no
Inclusion
Equality                    yes




Suppose W is a weak ordering and E is the equivalence relation deter-
mined by W. If E is an identity relation, then W is called a partial ordering.
A good example is the ordering of the real numbers.
The intended interpretation of a weak ordering W is that it arranges the
elements of U in a certain order. We may think of xWy as expressing that y
is at least as far ahead in the order as x. However, in using the ordering of
the real numbers as an example of a weak ordering, one may easily be misled
into thinking that every pair of elements in the universal set may be compared
concerning their relative position. That this is not the case for all weak
orderings is shown in the next example.

Example 2 Let A be the set {a, b, c} and let U be the collection of all sub-
sets of A. The relation W defined by xWy if and only if x is a subset of y is
easily shown to be reflexive and transitive and hence to be a weak ordering.
(Since the equivalence relation determined by W is the relation of equality,
W is also a partial ordering.) Then if x = {c} and y = {b, c}, we would have
xWy and would say that y is at least as far ahead in the ordering as x. Con-
tinuing this analogy suggests that A itself should be at the top of the ordering
and ∅ at the bottom, since for every x ∈ U we have xWA (x ⊆ A) and ∅Wx
(∅ ⊆ x). However, some pairs are not capable of comparison. For example,
if x = {a, b} and y = {b, c}, we have neither xWy nor yWx.
Figure 4 shows the position in the order of each element of U. If xWy, we
place y above x and connect them by a line. Not all lines have been drawn in.
For instance, we have ∅Wx for all x ∈ U, but only three of the seven possible
lines emanating from ∅ appear on the diagram. However, from the diagram
we read, for example, ∅W{c} and {c}W{b, c}. The transitivity of W then
yields ∅W{b, c}. Other cases may be similarly obtained by “reading” transi-
tivity into the diagram.
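The claims of Example 2 can be checked mechanically. The following Python sketch is an illustration only (the helper names `W`, `covers`, and `lines` are our own, not the text's): it builds the eight subsets of A, verifies that the subset relation is reflexive, transitive, and antisymmetric, lists the pairs that cannot be compared, and finds the lines one would draw in a diagram such as Figure 4, namely those pairs with nothing strictly in between.

```python
from itertools import combinations

A = ("a", "b", "c")
# U is the collection of all subsets of A (8 sets in all).
U = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def W(x, y):
    """xWy if and only if x is a subset of y."""
    return x <= y

reflexive = all(W(x, x) for x in U)
transitive = all(W(x, z) for x in U for y in U for z in U if W(x, y) and W(y, z))
# W is a partial ordering when xWy and yWx together force x = y.
antisymmetric = all(x == y for x in U for y in U if W(x, y) and W(y, x))

# Pairs for which neither xWy nor yWx holds.
incomparable = [(x, y) for x, y in combinations(U, 2) if not W(x, y) and not W(y, x)]

def covers(x, y):
    """True when y lies directly above x: x is a proper subset of y with nothing between."""
    return x < y and not any(x < z < y for z in U)

lines = [(x, y) for x in U for y in U if covers(x, y)]
from_empty = [y for x, y in lines if not x]

print(reflexive, transitive, antisymmetric)
print(len(incomparable), "incomparable pairs")
print(len(lines), "lines in the diagram;", len(from_empty), "of them start at the empty set")
```

The count of lines leaving the empty set agrees with the remark that only three of the seven possible lines from ∅ need to be drawn; the rest follow by transitivity.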

FIGURE 4

A = {a, b, c}



Two elements x and y of a set U are said to be comparable with respect to
an ordering relation O if either xOy or yOx. Otherwise x and y are not com-




We arbitrarily require that E be reflexive. And it is apparent from the way
in which the relations are defined that L is irreflexive and E is symmetric. What
is not apparent, and indeed may fail to occur, is the transitivity of L and E.
The determination of experimental conditions or subject types for which
this condition does or does not hold is of particular interest.

We cannot pursue this situation further here* but will content ourselves
with proving (in the case where transitivity obtains) that L and E are con-
sistent relations.

We must show first that aLb and bEc together imply aLc. It will be simpler
to proceed indirectly and prove the equivalent contrapositive statement:

~aLc implies either ~aLb or ~bEc


From the way in which the relations L and E are defined, we must have
either aLc or cLa or cEa. Thus ~aLc is equivalent to cLa ∨ cEa. If cLa and
aLb, then (transitivity of L) cLb, from which we obtain ~cEb. If cEa and
bEc, then bEa, which implies ~aLb. Since all cases have been considered,
this completes the proof of the first criterion of consistency. The second is
established in a similar manner. ►


PROBLEMS 

1. Complete the proof that L and E are consistent, that is, prove that aLb 
and aEc together imply that cLb 

2. Are the following relations consistent? Why or why not?

(a) S “lives next to” and E “is in the same precinct as”

(b) D “pays the same dividend as” and E “is listed on the same exchange
as”

(c) S “is shorter than” and E “is the same height as”

(d) F “is a friend of” and C “lives in the same community as”

3. List three pairs of consistent relations.

4. The statement “L is irreflexive and E is symmetric” appears in the fifth
paragraph of Example 1. Prove that this is a true statement.


It is easily seen that if E is an equivalence relation whose equivalence
classes each contain only a single element and R is any relation, then E and
R are consistent. For, in this case, since each equivalence class contains only
one element, we have xEy if and only if x = y. The requirements of con-
sistency are automatically satisfied when x = y.
An equivalence relation whose equivalence classes consist of single ele-
ments is called an identity relation. Familiar examples are the relations of
equality for numbers and for sets. Roughly speaking, no two distinct ele-
ments of the universal set are alike relative to an identity relation.
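The two criteria of consistency used in this section (xRy and yEz together imply xRz; xRy and xEz together imply zRy) lend themselves to a direct check when the relations are finite sets of ordered pairs. The sketch below is illustrative only; the function name `is_consistent` and all of the sample data are assumptions of ours.

```python
def is_consistent(R, E):
    """Check both consistency criteria for relations given as sets of ordered pairs."""
    # First criterion: xRy and yEz together imply xRz.
    first = all((x, z) in R for (x, y) in R for (w, z) in E if w == y)
    # Second criterion: xRy and xEz together imply zRy.
    second = all((z, y) in R for (x, y) in R for (w, z) in E if w == x)
    return first and second

U = {1, 2, 3}
identity = {(x, x) for x in U}          # every equivalence class is a single element
R = {(1, 2), (3, 1)}                    # an arbitrary relation on U
print(is_consistent(R, identity))       # an identity relation is consistent with any R

# A hypothetical non-trivial pair: "has fewer letters than" and "has as many letters as".
words = ["ox", "cat", "dog", "horse"]
fewer = {(p, q) for p in words for q in words if len(p) < len(q)}
same = {(p, q) for p in words for q in words if len(p) == len(q)}
print(is_consistent(fewer, same))

# Lumping 2 and 3 together breaks consistency with {(1, 2)}, since (1, 3) is missing.
lumped = identity | {(2, 3), (3, 2)}
print(is_consistent({(1, 2)}, lumped))
```

The last case fails the first criterion: 1R2 and 2E3 hold, but 1R3 does not.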


*Paired comparisons are discussed in Torgerson, W. S., Theory and Methods
of Scaling (John Wiley & Sons, Inc., New York, 1958).





is shown in Figure 6. Given such a J scale, the individual’s I scale can be ob-
tained as follows. Consider the J scale to be hinged at the point X. If we fold
the scale so that the part of the scale to the left of the hinge is merged with
the part to the right, we obtain Figure 7. It is now apparent that the I scale
for this individual must be ABCDE, reflecting the relative nearness of the
stimuli positions to his ideal stimulus value X. Any I scale can be
regarded as a folded J scale. Since the experimental data consist of observed
I scales, the J scale is recovered by unfolding the I scales, hence the reason
for the name of the scaling technique.

FIGURE 6 Joint continuum or J scale

FIGURE 7 Individual or I scale


Consider all individuals whose ideal point is located to the left of
stimulus position A on the J scale in Figure 6. Since the location of the ideal
point determines the distance to the respective stimulus positions, the I scales
cannot, in general, be regarded as being quantitatively equivalent for all
such individuals. However, the preference ordering ABCDE will
be the same for all such individuals. Thus, the I scales can be regarded as
being qualitatively equivalent.

In fact, the order will not change until we consider individuals whose ideal
points are immediately to the right of the midpoint between A and B. For
these individuals, the qualitative I scale is BACDE. In passing the midpoint
between A and C, stimuli A and C change positions, and the I scale is BCADE.
Hence the midpoint between A and B and the midpoint between A and C
bound an interval on the J scale such that the I scales are the same for any
X within the interval.

Continuing to move across the J scale, we see that the midpoints between
all possible pairs of positions partition the J scale, as shown in Fig-
ure 8, into intervals such that the I scales are qualitatively the same for all
individuals whose ideal point is located within the same interval.

Before the J scale can be recovered by unfolding the I scales, it is necessary
to order the I scales. From Figure 8, we note that the J scale with 5 stimulus
positions is partitioned into 11 segments. Since each segment generates a






parable. In Figure 4, for instance, sets {c} and {b, c} are comparable, while
{a, b} and {b, c} are not. An ordering relation is called connected if every
pair of elements of U is comparable. Thus the ordering relation ≤ for real num-
bers is connected, while the relation S for sets, considered in Example 2,
is not.

A partial ordering which is connected is called a simple ordering of U.
Simple ordering is one of the most important ordering relations and the one
which is usually intended when one speaks of ordering a collection of ob-
jects. The most familiar example is the relation ≤ for real numbers. The
diagram of every simple ordering consists of a single vertical line connecting
the elements of U, as shown in Figure 5. This is because in a simple ordering
all pairs of elements are comparable and no two elements are alike relative
to the ordering.


FIGURE 5 



Example 3 A recurring problem in behavioral studies is that of assigning
positions on some hypothetical scale which reflect the preference orderings
of a group of subjects for a certain set of stimuli objects.
One theory of preference ordering is the “unfolding technique” developed
by Coombs.* In case (I) of Coombs’ theory each stimulus object j is assumed
to have a fixed scale position Q_j and each individual i is assumed to have
chosen (perhaps subconsciously) a fixed scale position C_i representing his
ideal stimulus value. The basic postulate which attempts to explain observed
behavior is: Given the choice between objects j and k, a subject will respond
jPk (object j is preferred to object k) if and only if the scale value Q_j is closer
than the scale value Q_k to the ideal value C_i.

The usual restrictions imposed require that P be a simple ordering relation.
Hence, presentation of all pairs of stimuli will result in a rank ordering of the
stimulus objects for a given individual. Such a rank ordering is called an
individual or I scale. A scale on which both stimuli and individuals are as-
signed positions is called a joint continuum or a J scale. An example of a J
scale with stimuli positions A, B, C, D, and E and a subject’s ideal value X


*Coombs, C. H., “Psychological Scaling without a Unit of Measurement,” Psychological
Review 57, 145-153 (1950). For a more comprehensive description see Coombs, C. H.,
A Theory of Data (John Wiley & Sons, Inc., New York, 1964).





PROBLEMS 

5. Let xFy denote the relation “x voted in favor of all the bills that y did,”
defined on the set U of all congressmen.

(a) Suppose that no two congressmen voted for exactly the same bills.
Show that the relation F is a partial ordering.

(b) Consider the possibility that x and y may have voted for exactly
the same bills. Does F still give a partial ordering of U? Give a
reason for your answer.

6. Let U be a set of machines. Let T be the weak ordering relation for which
xTy means x is at least as efficient and costs no more than y. Show
that:

(a) T is not connected on the set of all machines.

(b) T is connected on a set of machines all of the same cost.

(c) T is connected on a set of machines all of the same efficiency.

7. Suppose that you wish to determine a subject's preferences for a set [...]
connected on the set U even if it is transitive.

8. Suggest a modification of the experimental [...]
which will change the relation P into a simple ordering relation.

9. Prove that the relations P and I, as defined in Examples 2 and 3, Section
3.3, are consistent.

10. Prove that if E is the equivalence relation determined by a weak ordering
relation W, and P is the preference relation determined by W, then P and
E are consistent.

11. Prove that any simple ordering is determined by its preference relation.

12. Referring to Example 1, define the relation S by xSy if and only if
xEy ∨ xLy.

(a) Prove that S is a weak ordering and that L and E are, re-
spectively, the preference and equivalence relations determined by S.

(b) Prove that S and E are consistent relations.

13. Prove that if W is any weak ordering relation and E is the equivalence
relation determined by it, then W and E are consistent. Compare with
Problem 12 of Section 3.3.

3.5 GRAPHIC REPRESENTATION OF ORDER
RELATIONS


Diagrams, such as Figures 4 and 5 of the preceding section, facilitate the ap-
plication of the theory of relations to specific problems. In this section we
shall consider some important notions concerning relations which are most
easily presented in graphic form.



unique I scale, the J scale in our example yields a set of 11 different I scales.
All I scales must of necessity end with the first or last stimulus position on
the J scale. The I scale associated with interval I_1 establishes the simple order-
ing of the stimuli on the J scale. The remaining scales can be ordered accord-
ing to the rule that a stimulus must first move to the left in an I scale before
it can move to the right. Adjacent stimuli in I scales simply change order
when we move from one adjacent interval to another on the J scale.
Ordering of the I scales induces a simple ordering on the midpoints and
as such provides information about relative distances between stimuli posi-
tions on the J scale. For example, the ordering of the I scales resulting from
the J scale of Figure 6 as presented in Table 1 indicates that the midpoint
BC precedes AE on the J scale. This can only be the case if the distance be-
tween A and B, denoted AB, is less than the distance CE between C and E.


TABLE 1 Ordering of the I scales of Figure 8

I scale   Arrangement   Order of     Relation of distances
          of I scale    midpoints

I_1       ABCDE         AB
I_2       BACDE         AC
I_3       BCADE         AD           AD precedes BC ⇒ AB > CD
I_4       BCDAE         BC           BC precedes AE ⇒ CE > AB
I_5       CBDAE         AE           AE precedes BD ⇒ AB > DE
I_6       CBDEA         BD           AE precedes CD ⇒ AC > DE
I_7       CDBEA         CD           CD precedes BE ⇒ DE > BC
I_8       DCBEA         BE
I_9       DCEBA         CE
I_10      DECBA         DE
I_11      EDCBA
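The folding of a J scale into I scales can be imitated numerically. In the Python sketch below, the scale positions A = 0, B = 3, C = 4, D = 6, E = 8 are purely hypothetical values, chosen by us only because they satisfy the distance relations listed in Table 1; the function names are likewise our own. The sketch computes the ten midpoints, takes one ideal point X inside each of the eleven resulting intervals, and ranks the stimuli by their distance from X.

```python
from itertools import combinations

def i_scale(positions, ideal):
    """Rank the stimuli by distance from the ideal point (the folded J scale)."""
    return "".join(sorted(positions, key=lambda s: (abs(positions[s] - ideal), s)))

def unfold_intervals(positions):
    """Return the I scale generated by each interval between successive midpoints."""
    mids = sorted((positions[a] + positions[b]) / 2 for a, b in combinations(positions, 2))
    # Pad with one point beyond each end of the J scale.
    cuts = [min(positions.values()) - 1] + mids + [max(positions.values()) + 1]
    # Use the centre of each interval as a representative ideal point X.
    return [i_scale(positions, (lo + hi) / 2) for lo, hi in zip(cuts, cuts[1:])]

# Hypothetical J scale positions (an assumption; not given in the text).
J = {"A": 0, "B": 3, "C": 4, "D": 6, "E": 8}
scales = unfold_intervals(J)
for k, s in enumerate(scales, start=1):
    print(f"I_{k}: {s}")
```

Any other positions with the same ordering of midpoints would generate the same eleven I scales, which is why the I scales carry information only about the order of the stimuli and the relative sizes of certain distances.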




Information on the relative distances between stimuli positions on the J
scale of our example is illustrated by the partial ordering in Figure 9. The
value of the unfolding procedure is that, given only the ordering of the ob-
servable I scales, we can infer not only the stimuli ordering on the hypotheti-
cal J scale but also information about the relative distances be-
tween the stimuli positions on the scale. ►


FIGURE 9 Partial ordering of distances (CE, largest distance; BC, smallest distance)





class. In the most efficient setup there would be a single communication
class containing all eight persons.

A simpler version of Figure 10 is shown in Figure 11, in which arrows are
drawn only to indicate contacts between equivalence classes, it being under-
stood that contact is always possible between members of the same com-
munication class. Note that {a} is ahead of each of the other classes in the
ordering and that {c, d} is above {e, f, g, h}. However, {b} is comparable
with {a} but not comparable with either of the other two classes. As an or-
dering of equivalence classes, then, it follows that Figure 11 represents a
partial ordering but not a simple ordering.


FIGURE 11


The transition from an ordering of elements (Figure 10) to an ordering
of equivalence classes (Figure 11) may be made for any weak ordering W.
The new relation W̄ which orders the equivalence classes is called the in-
duced ordering relation corresponding to W. An induced ordering relation
W̄ is always a partial ordering. To see why this is so, we note that if E_1
and E_2 are two different equivalence classes determined by a weak ordering
W, exactly one of the following three cases must occur:

I. There are no two elements x ∈ E_1 and y ∈ E_2 which are comparable.
That is, for every such choice of x and y, we have neither xWy nor
yWx. As an example, take E_1 = {b} and E_2 = {c, d} in Figure 11.

II. For every choice of two elements x ∈ E_1 and y ∈ E_2, we have xWy.
That is, every element of E_1 is at least as far ahead in the ordering as
any element of E_2. For example, let E_1 = {c, d} and E_2 = {e, f, g, h}
in Figure 11.

III. For every choice of two elements x ∈ E_1 and y ∈ E_2, we have yWx.
That is, every element in E_2 is at least as far ahead as any element in
E_1.

Of course, elements in the same equivalence class are equally far ahead in
the ordering, since xEy ⇔ xWy ∧ yWx.




One of the most useful graphic representations of a relation is the flow
diagram (see Figure 10). Consider the set of employees of a
utility providing service for a city. An efficient communication network
is one which allows any person to contact any other person in the organiza-
tion, but for practical reasons this may not be possible. For any two em-
ployees in the network, one of the following four cases exists:

(1) Each can contact the other.

(2) The first can contact the second, but not conversely. This might be
the case for a repairman working in a home and the central dispatcher.

(3) The second can contact the first, but not conversely.

(4) Neither can contact the other. This might be the case for two repair-
men working in different parts of the city.


FIGURE 10 



Suppose the relation C is defined by xCy if and only if x can contact y
directly or indirectly. This relation is reflexive since anyone can contact him-
self. And C is transitive since if x can contact y and y in turn can contact z,
then x, by using y as an intermediary, can contact z. The relation C is thus
a weak ordering.

The flow diagram of Figure 10 illustrates a possible contact structure for
eight persons in the firm. The arrows indicate the direction in which a con-
tact can be made. However, the diagram has been simplified by not showing
all possible contacts. For instance, each person can contact himself. In
addition, transitivity means, for example, that eCf and fCg together imply
eCg. Only the essential arrows are shown, and reflexivity and transitivity
must be read into the graph.

The equivalence relation E determined by C is such that xEy if and only
if xCy ∧ yCx; that is, xEy if and only if x and y can each contact the other.
We shall say that x and y can communicate if xEy. The four equivalence
(communication) classes for E are indicated by the circles in Figure 10. Two
employees communicate when they are members of the same communication




On the basis of past experience, the following skill relations have been
established:

dSj   cSf   aSh   iSh
fSe   dSa   gSa   jSh
jSb   hSc   fSi   aSg

This information is summarized in the flow diagram of Figure 12, in which
arrows indicate the direction of skill transference. Again reflexivity and
transitivity must be read into the graph.


FIGURE 12



A chain of arrows linking two jobs indicates sufficient skill
transference from one to the other. On the other hand, a loop
indicates a bilateral transference of skill between any two of its members, and
these jobs can be regarded as being alike in terms of skill demands.
The induced partial ordering of equivalence classes shown in Figure 13
represents the skill hierarchy for the 10 jobs. Note that job d requires
the greatest skill whereas jobs b and e make the least demands.
It seems reasonable to interpret any simple ordering as representing an
ordering of the jobs in terms of the relative amount required of a single
skill or a homogeneous group of skills. In Figure 13 there appear to be
three separate skill classes, one represented by the simple ordering

{d} → {j} → {b},




The fact that W places every member of an equivalence class in the same
position relative to the ordering means that W may be considered as imposing
the same ordering on the equivalence classes themselves. It is apparent from
I–III above that W̄ is transitive and reflexive because W is. Thus W̄ is itself a
weak ordering. However, since no two equivalence classes are “alike” rela-
tive to W̄ (if they were, they would be lumped together to form a single
class), it follows that W̄ is actually a partial ordering.


PROBLEMS 

1. Let U be the set of members of your class and define the relation W by
xWy if x is at least as tall and has hair at least as dark as y. List the
members of each equivalence class determined by W. Draw a flow dia-
gram for W and another for the induced ordering W̄.


A glance at Figure 10 shows that each communication class forms a circle
or a loop in the ordering around which contacts may be made successively.
Through such a loop every person may contact every other.


Definition 3 The elements a_1, a_2, ..., a_n are said to form a loop of the
relation R if n ≥ 2 and

a_1Ra_2 ∧ a_2Ra_3 ∧ ... ∧ a_{n−1}Ra_n ∧ a_nRa_1        (3)

A relation is loop free if no loop exists.

Loops cannot occur at all in preference relations, nor, except trivially, in
partial orderings. For suppose P is a preference rela-
tion and that (3) holds for P. Then the first statement in (3) is a_1Pa_2, while
the remaining statements, by the transitivity of P, imply a_2Pa_1. These two
statements together violate the asymmetry of P, so that (3) cannot
be true for P.

Similarly, if O is a partial ordering and (3) determines a loop of O, then
we have

a_1Oa_2 ∧ a_2Oa_1        (4)

But if E is the equivalence relation determined by O, then a_1Ea_2
if and only if (4) holds. Since E is an identity relation, a_1Ea_2 if and only if
a_1 = a_2. Similar arguments show that if (3) holds, then all the elements
a_1, a_2, ..., a_n must be the same.

Suppose someone is interested in estab-
lishing a skill hierarchy for 10 jobs. Consider the
relation S defined by xSy if and only if the skills required to perform job x
suffice also for the performance of job y. It is apparent that S is reflexive and
transitive and hence a weak ordering.




6. Prove that a weak ordering relation T defined on a set U is a weak order-
ing on every subset of U.

7. Let W be a weak ordering relation and P the preference relation deter-
mined by W. An element x ∈ U is termed “maximal” if there is no
y ∈ U for which yPx. If x is the unique maximal element, it is termed
“the maximum.” In a similar manner, if there is no y such that xPy, then
x is termed “minimal,” and if x is unique, it is called “the minimum.”
Prove that every weak ordering defined for finite U has at least one maxi-
mal element and at least one minimal element.


8. In the communication example illustrated by the flow diagram of Fig-
ure 10, prove that x is minimal if he can contact only members of his
equivalence class.

9. Let T be a weak ordering of the elements of a set U, and T̄ the corre-
sponding induced ordering of the equivalence classes determined by T.
Prove that if T is a simple ordering then every subset of U has both a
maximum and a minimum relative to T.

10. According to Suppes and Zinnes,* a relation P defined on a set U is a
semiorder whenever the following axioms are satisfied for all x, y, u, v
in U:

(1) P is irreflexive, that is, xPx is always false.

(2) If xPy and uPv, then either xPv or uPy.

(3) If xPy and yPu, then either xPv or vPu.

(a) If an indifference relation I is defined as xIy if and only if not xPy
and not yPx, show that I is not an equivalence relation.

(b) If a relation E is defined as xEy if and only if for all u ∈ U, xIu
if and only if yIu, show that E is an equivalence relation.


11. In Problem 10, prove that

(a) If xPu and xEy, then yPu.

(b) If uPx and xEy, then uPy.

(c) Given the results in (a) and (b), show how a relation P̄ can be defined
which will order the E equivalence classes of U.

12. Let U be the set of all positive integers and define the relation R by xRy
if x and y both give the same remainder when divided by the positive
integer m. (We express this by saying that x is congruent to y modulo m.)
Let the relation S be defined by xSy if x/m has a greater remainder than y/m.

(a) Show that R is an equivalence relation.

(b) Show that S is a preference relation.

(c) Show that R and S are consistent.

13. In Problem 12, prove that if xRy and uRv, then

(x + u)R(y + v) and (xu)R(yv)

*Suppes, P., and Zinnes, J. L., “Basic Measurement Theory,” in Handbook of Mathematical
Psychology, Luce, R. D., Bush, R. R., and Galanter, E., eds. (John Wiley & Sons, Inc.,
New York, 1963), Vol. I, Chapter 1.




FIGURE 13 



a second by {d} → {j} → {h, c, i, f} → {e}, and the third by
{d} → {a, g} → {h, c, i, f} → {e}.

One would suspect that those equivalence classes which are members of
more than one simple ordering contain jobs which require several distinct
types of skills. ►
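The passage from the skill relations to the hierarchy of equivalence classes can be traced step by step in Python. The sketch below is an illustration under assumptions: it takes the twelve skill relations listed in this example (reading the garbled entry as fSi), reads reflexivity and transitivity into the graph, groups jobs that are alike into equivalence classes, and then finds the arrows of the induced ordering. All function and variable names are our own.

```python
from itertools import product

jobs = "abcdefghij"
# The twelve given skill relations, written "xy" for xSy (fSi is an assumed reading).
given = ["dj", "cf", "ah", "ih", "fe", "da", "ga", "jh", "jb", "hc", "fi", "ag"]

# Start from the given arrows and read in reflexivity.
S = {(x, x) for x in jobs} | {(p[0], p[1]) for p in given}

# Read in transitivity (Warshall's procedure: k is the intermediate job).
for k, i, j in product(jobs, repeat=3):
    if (i, k) in S and (k, j) in S:
        S.add((i, j))

# Jobs x and y are alike when xSy and ySx both hold.
classes = {frozenset(y for y in jobs if (x, y) in S and (y, x) in S) for x in jobs}

def above(X, Y):
    """Class X is ahead of class Y in the induced ordering."""
    return X != Y and (next(iter(X)), next(iter(Y))) in S

def covers(X, Y):
    """X is directly above Y: no third class lies between them."""
    return above(X, Y) and not any(above(X, Z) and above(Z, Y) for Z in classes)

def name(X):
    return "".join(sorted(X))

hasse = {(name(X), name(Y)) for X in classes for Y in classes if covers(X, Y)}
top = sorted(name(X) for X in classes if not any(above(Y, X) for Y in classes))
bottom = sorted(name(X) for X in classes if not any(above(X, Y) for Y in classes))

print(sorted(name(X) for X in classes))
print(sorted(hasse))
print("greatest skill:", top, " least demands:", bottom)
```

Following the arrows in `hasse` from d downward gives the three simple orderings described in the example.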

PROBLEMS 

2. The salary classification for an employee of the Burke Company is deter- 
mined by years of formal education and amount of experience Let 

U = {(16, 3), (14,1), (14,5), (12,1), (20,5), (14,3), (16,10), (12, 15), 
(12,3)} 

be the set of ordered pairs of numbers of years formal education and ex- 
perience, respectively, for 9 employees Draw the flow diagram for the 
relation T where xTy if and only if x has at least as many years formal 
education and at least as many years experience as y 

3. Redraw the flow diagram of Problem 2 under the assumption that 

(a) one year of experience is equivalent to 2 years of formal education 

(b) each year of experience is equivalent to \ year of formal education 

4. Explain why the rules in Problem 3 changed T from a partial ordering to
a simple ordering.

5. Let V be a set of corporations. State a rule for each of the following
weak ordering relations which will change it into a simple ordering rela-
tion on its equivalence classes.

(a) T is defined as “has at least as many employees as and no greater
sales than.”

(b) T is defined as “has no more assets than and at least as good a profit
margin as.”

(c) T is defined as “has at least as much net income, and no greater per
share earnings, and no less dividends per share than.”




FUNCTIONS AND REAL NUMBERS


4.1 FUNCTIONS


The concept of function, or the dependence of one quantity upon another,
pervades social and behavioral science. The economist attempts to describe
and to predict the manner in which economic processes change over time.
Quantities such as GNP, the prime interest rate, the national debt, and so
forth, are thought of as functions whose values change as time passes. To the
sociologist, the growth of a population depends on many variables, such as
time, the present size of the population, and migration rates.
From a mathematical point of view, a function is a special kind of
relation. Specifically, a function is a set of ordered pairs in which no two
pairs have the same first element. Let us look at some examples.


Example 1 Table 1 lists the number of known cases of robbery per 100,000
population for selected states for the year 1965.* This table contains




TABLE 1 Cases of robbery per 100,000 population, selected states, 1965

*Source: Statistical Abstract of the United States (U.S. Department of Commerce, 1967),
p. 151.














Problem 13 may be used in the following way. The numbers 10 and 2 both
yield the same remainder (2) when divided by 8. Thus 100 = 10 · 10 yields
the same remainder as 2² (that is, 4) when divided by 8. Similarly, 1000 =
10³ gives the same remainder as 2³ = 8 (that is, zero remainder) when
divided by 8.
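The arithmetic in this remark is easy to confirm by machine. The short Python sketch below is an illustration, not a proof: it checks the two remainders quoted above and then tests, for small positive integers only, that sums and products of numbers with equal remainders again have equal remainders. The helper names are our own; `%` and the three-argument `pow` are standard Python.

```python
def same_remainder(x, y, m):
    """xRy: x and y give the same remainder when divided by m."""
    return x % m == y % m

def sums_and_products_agree(m, limit=25):
    """For 1 <= x, y, u, v < limit: xRy and uRv give (x+u)R(y+v) and (xu)R(yv)."""
    nums = range(1, limit)
    return all(
        same_remainder(x + u, y + v, m) and same_remainder(x * u, y * v, m)
        for x in nums for y in nums if same_remainder(x, y, m)
        for u in nums for v in nums if same_remainder(u, v, m)
    )

m = 8
print(same_remainder(10, 2, m))      # 10 and 2 both leave remainder 2
print(100 % m, (2 * 2) % m)          # both remainders are 4
print(1000 % m, (2 ** 3) % m)        # both remainders are 0
print(pow(10, 3, m))                 # remainder of 10**3 computed directly
print(sums_and_products_agree(m))
```

Replacing a large number by a small one with the same remainder before multiplying is exactly the shortcut the remark describes.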


14. What is the remainder when

(a) 1000 is divided by 7?

(b) 1,000,000 is divided by 8?

(c) 3^50 is divided by 13?

15. Suppose you have 12 objects, indistinguishable in outward appearance,
of which 11 have the same weight, while the 12th is either heavier or
lighter than the rest. Let O_i E O_j denote that objects O_i and O_j are of
equal weight, while O_i H O_j means that O_i is heavier than O_j.

(a) Argue that the relations E and H are consistent.

(b) Prove that three weighings on a simple balance scale are sufficient to
determine the two equivalence classes of the relation E.


SUPPLEMENTARY READING 

Cogan, E. J., Kemeny, J. G., Norman, R. Z., Snell, J. L., and Thompson, G. L.,
Modern Mathematical Methods and Models (Mathematical Association of Amer-
ica, 1958), Vol. 2, Unit II: Order Relations.

Suppes, P., Introduction to Logic (D. Van Nostrand Company, Princeton, N.J.,
1957), Chapter 10.

Suppes, P., and Zinnes, J. L., “Basic Measurement Theory,” in Handbook of
Mathematical Psychology, Luce, R. D., Bush, R. R., and Galanter, E., eds.
(John Wiley & Sons, Inc., New York, 1963), Vol. I, Chapter 1.





circle, then √(x² + y²) = r or x² + y² = r². A circle with radius 5 is a
set of ordered pairs of real numbers and as such it constitutes a relation

C = {(x, y) : x² + y² = 25}        (1)

defined on the set R of real numbers. Figure 1(b) is called the graph of this
relation. Note that the relation C is not a function since it contains different
pairs, for instance (4, 3) and (4, −3), having the same first elements.


When it is understood that the universal set is the set R of real numbers,
we shall normally use the shorthand of writing only the description of a rela-
tion rather than the relation itself. For instance, in Figure 1(b) we speak of
the relation x² + y² = 25 rather than writing the full statement (1).


Example 5 The points (x, y) on the horizontal line through the origin
(called the horizontal axis or the x axis) all have the property that y = 0.
No other points have this property. Hence this line is the graph of the rela-
tion y = 0 (see Figure 2). This relation is also a function since no two pairs
have the same first element and different second elements.


FIGURE 2 



Similarly, the vertical line through the origin (the vertical axis or the y axis)
is the graph of the relation x = 0, while the vertical line passing one
unit to the right of the origin is the graph of the relation x = 1. Neither of
these relations is a function, since in each case all pairs have the same first
element, but different second elements.

The line passing through the origin at an angle of 45° with the positive half
of the horizontal axis represents the graph of the relation y = x, that
is, the set of points E = {(x, y) : y = x}. This relation is a function. ►


The following definition formalizes our discussion of functions and intro-
duces some useful terminology.




pairs of the form (x, y) where x is a state and y a number. As such it con-
stitutes a binary relation. This relation is a function since each state (first
element) appears only once. Note that the same number may correspond to
different states.


Example 2 (a) The relation

D = {(Robert, James), (Sam, Teddy), (Robert, Teddy), (James, Sam)}

representing the dominance relations among four men, (a, b) ∈ D if a dom-
inates b, is not a function since two pairs, (Robert, James) and (Robert,
Teddy), have the same first element.
(b) The relation

E = {(Robert, James), (Sam, Teddy), (James, Sam), (Teddy, James)}

is a function. The fact that the pairs (Robert, James) and (Teddy, James)
have the same second element is of no consequence. ►
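The test applied in Example 2 (no two different pairs with the same first element) can be written as a one-line check. The Python sketch below is illustrative; the name `is_function` is our own, and relations are represented as sets of ordered pairs.

```python
def is_function(relation):
    """A relation (set of ordered pairs) is a function if no first element repeats."""
    firsts = [a for a, _ in relation]
    return len(firsts) == len(set(firsts))

D = {("Robert", "James"), ("Sam", "Teddy"), ("Robert", "Teddy"), ("James", "Sam")}
E = {("Robert", "James"), ("Sam", "Teddy"), ("James", "Sam"), ("Teddy", "James")}

print(is_function(D))   # Robert appears twice as a first element
print(is_function(E))   # repeated second elements do not matter

# Two pairs satisfying x² + y² = 25 with the same first element.
print(is_function({(4, 3), (4, -3)}))
```

Because a Python set cannot hold the same pair twice, counting first elements is enough to detect two different pairs that share one.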


Example 3 The pair (x, y) of real numbers x and y indicates the point
lying x units to the right and y units above the origin (0, 0). The distance D
from (0, 0) to (x, y) is D = √(x² + y²), computed using the Pythagorean
Theorem for right triangles (see Figure 1(a)). For instance, if x = 4 and
y = 3, then D = √(4² + 3²) = 5.


FIGURE 1 




To each point P = (x, y) in the plane, there corresponds a distance
D = √(x² + y²). The set of all pairs (P, D) is a function since to each point
P there corresponds only a single distance D. Note that in this case the first ele-
ments P are themselves ordered pairs. ►


Example 4 The points which are all the same distance r from the origin
form the circle with radius r and center at (0, 0). If (x, y) is a point on this




PROBLEMS 


2. What is the value of L(I_{4,10})? Of L(I_{−3,0})?

Example 8 The complement function C is defined on the subsets of a
universal set U by

C(A) = A'

The domain and range of C are the same, each being the collection of all
subsets of U. ►


PROBLEMS

3. Let U = {a, b, c, d}, A = {a, c}, B = {b, c, d}, and D = {b, d}. Find
C(A), C(B), C(A ∪ B), and C(A ∩ B). What is the domain of the func-
tion C?


Example 9 The identity function I is defined on any set A by

I(a) = a for each a ∈ A

That is, an identity function pairs every element with itself. The function
y = x in Example 5 is the identity function on the set of real numbers. ►


Example 10 Let U be a given universal set and let n be the function which
assigns to any subset A of U the number n(A) = number of elements in A.
For instance, n({a, b, c}) = 3, n({7, 2}) = 2, and n(∅) = 0.
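The functions of Examples 8 through 10 act on sets rather than on numbers, which is easy to mirror in Python. The sketch below is illustrative only: the universal set and the subsets A and B are hypothetical choices of ours, and the checks at the end verify single instances, not general theorems.

```python
# A hypothetical universal set with k = 5 elements.
U = frozenset("abcde")

def C(A):
    """Complement function: C(A) = A', the elements of U not in A."""
    return U - A

def n(A):
    """Counting function: n(A) = number of elements in A."""
    return len(A)

def I(a):
    """Identity function: pairs every element with itself."""
    return a

A = frozenset("ab")
B = frozenset("de")      # A and B have no elements in common

print(sorted(C(A)))                        # the complement of A
print(C(C(A)) == A)                        # complementing twice returns A
print(n(C(A)) == len(U) - n(A))            # n(A') = k - n(A) in this instance
print(n(A | B) == n(A) + n(B))             # additivity for disjoint sets
print(all(I(a) == a for a in U))
print(n(frozenset()))                      # the empty set has no elements
```

Note that both the domain and the range of `C` are collections of subsets of U, whereas `n` sends subsets to non-negative integers.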


PROBLEMS 

4. (a) Suppose the set U in Example 10 contains k elements. Prove that for
any subset A of U, n(A') = k − n(A).

(b) Prove that if subsets A and B have no elements in common, that is,
if A ∩ B = ∅, then n(A ∪ B) = n(A) + n(B).

A geometrical way of viewing a function is shown in Figure 4. Each ele-
ment a in the domain of f is mapped, or transformed, into the corre-
sponding element b = f(a) in the range of f. In this way of looking at func-
tions, b is sometimes called the image of a relative to the function f.

FIGURE 4 A function as a mapping



members of elements of / is called the range of/and is denoted by *(/) * 


Example 6 (a) The domain of the function determined by Table 1 is the
set of states

{Rhode Island, Illinois, North Dakota, Kansas, Louisiana, Hawaii,
Oklahoma, California}

while the range is the set

{5, 19, 24, 38, 51, 113, 165}

(b) The domain of the function y = 0 graphed in Figure 2 is the set R of
all real numbers. The range is the set {0} containing the single element zero.


PROBLEMS 

1. What are the domain and range of the function in Example 2(b)? Of the
function y = x in Example 5?

The essential character of a function f is that it associates with each
element of a certain set, the domain D(f) of the function, an element of another
set, the range R(f) of the function. This way of looking at functions is so
convenient in practice that we introduce the following special notation and
terminology.

If the ordered pair (a, b) is a member of the function f, that is, if f
associates the element $b \in R(f)$ with the element $a \in D(f)$, then b is called
the value of f at the point a and we write b = f(a). From this point of view,
a complete description of a function is obtained by indicating its domain
and describing a method or rule for determining its values.


Example 7 Let $I_{a,b}$ $(b \ge a)$ denote the interval $\{x : a \le x \le b\}$ of real
numbers lying between a and b, inclusive (see Figure 3). Let L be the
function which assigns to each interval its length. Then

$L(I_{a,b}) = b - a$

The domain of L is the collection of all intervals on the real line, while the
range of L is the set of all non-negative real numbers. ►

FIGURE 3 





8. Which of the following relations defined on the set R of all real numbers
also qualify as functions?

(a) $y = x^2 - x - 6$ (b) $y^2 - y = x + 4$

(c) $y^2 - x^2 = 25$ (d) $y^2 = x^4$

(e) $9x^2 + 25y^2 = 225$ (f) $9x^2 - 16y^2 = 144$


9. The total cost C of a certain commodity is given in terms of the quantity 
Q as 

C =/(Q) = 3 - 02 Q + 01Q 2 

For the values Q = 1, 2, 3, 4, 5, calculate the corresponding values of C,
plot the points (Q, C), and draw the graph of the function f through the
points.

10 Deutsch* has proposed the following index of government stability 

. g y 
S, = /(g,L,po!,y,yio) = >10 

where $S_t$ indicates stability, g is the ratio of government income to total
national income, L is the percentage of literacy, "pol" is the percentage of
political participation, y is the per capita national income, and $y_{10}$ is
the percentage of the total national income received by the top 10% of
income receivers.

(a) Assuming that g = f , L = f, } - 2000, and > io = 5* compute S, 
when pol = rs, b i 

(b) Solve the above equation for pol, thereby obtaining a formula which
expresses pol as a function of g, L, $S_t$, y, and $y_{10}$.

It has been empirically demonstrated! that if M0 1S number of 
with population greater than i million, then 


where *. and p are constants (the symbol ~ mean PP fln m ex . 

The 1960 U.S. census lists 5 metropolitan areas with population in
excess of 3 million and 24 metropolitan areas with population in excess of
1 million. Use these figures to compute estimates of the constants.
Then draw a curve representing the function.

12. The following is a simple system of equations in macroeconomic
variables*



Still another way of visualizing a function is as a machine or system which
accepts elements of D(f) as inputs and which produces corresponding
elements of R(f) as outputs (see Figure 5). If we insert an element $a \in D(f)$
into the system, the corresponding value f(a) comes out. If another element
$c \in D(f)$ is inserted, we obtain another (not necessarily different) value
f(c). If we try to insert something not contained in the domain of f, it is
rejected, for f operates only on elements belonging to its domain. Interpreting
a function in this way makes clear the distinction between a function (the
machine) and its values (outputs of the machine). A function should no more
be confused with its values than a vending machine should be confused with
a soft drink.
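
The machine picture can be sketched in Python, assuming a function given by finitely many ordered pairs. The class name `FunctionMachine` and the sample pairs are hypothetical and serve only as illustration. The object `f` plays the role of the machine, while `f(1)` is one of its outputs.

```python
class FunctionMachine:
    """A function stored as a finite set of ordered pairs (a, f(a))."""

    def __init__(self, pairs):
        self._pairs = dict(pairs)

    def domain(self):
        return set(self._pairs)

    def range(self):
        return set(self._pairs.values())

    def __call__(self, a):
        # Inputs outside the domain are rejected, as in the machine picture.
        if a not in self._pairs:
            raise ValueError(f"{a!r} is not in the domain")
        return self._pairs[a]

f = FunctionMachine({1: "x", 2: "y", 3: "x"})
print(f(1), f(3))          # x x  (different inputs, same value)
print(sorted(f.range()))   # ['x', 'y']
```

Calling `f(4)` raises an error, since 4 is not in the domain of this particular machine.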

FIGURE 5 A function as a machine 



PROBLEMS 

5. Which of the following are functions?

(a) {(6 1), (5, 4) f (4 f ( 10j 

(b) {(Red, 10), (Green, 9), (Red, 8)}

(c) {(Jones, 1934)}

(d) {(Corn, 34¢), (Beans, 16¢), (Bread, 25¢)}

6. Define a function whose domain is the set

W = {red, plays, I, softly, Iowa}

and whose range is

{verb, noun, adverb, pronoun, adjective}

7. Plot the points given in the following tables and then
draw a graph which passes through the points.

’ 4 - 1 - 2-15 6 


(a) - 

(b) - 


H 13 15 


6 iTiTi nr 





Let $s_1$ and $s_2$ be two situations from the universal set of situations
that a person might encounter. Let $S_1$ and $S_2$ be the sets of aspects
corresponding to situations $s_1$ and $s_2$, respectively, and let n be the
counting function defined in Example 10.
The distance between two situations $s_1$ and $s_2$ is a function of the
common aspects and is defined as

$D(s_1, s_2) = n[(S_1 \cap S_2') \cup (S_1' \cap S_2)]$

$= n(S_1 \cap S_2') + n(S_1' \cap S_2)$

$= n[(S_1 \cup S_2) - (S_1 \cap S_2)]$

$= n(S_1) + n(S_2) - 2n(S_1 \cap S_2)$

Show that this distance function is a metric. (Hint: See Problem 14.)
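
As a numerical illustration (not a proof), the first and last expressions for this distance can be compared in Python. The two aspect sets below are invented for the example, and the function names are hypothetical.

```python
def distance(S1, S2):
    """n[(S1 ∩ S2') ∪ (S1' ∩ S2)]: count the aspects in exactly one set."""
    return len((S1 - S2) | (S2 - S1))

def distance_by_counts(S1, S2):
    """n(S1) + n(S2) - 2 n(S1 ∩ S2)."""
    return len(S1) + len(S2) - 2 * len(S1 & S2)

# Hypothetical sets of aspects for two situations.
S1 = {"noise", "crowd", "deadline"}
S2 = {"crowd", "deadline", "travel", "cold"}

print(distance(S1, S2))             # 3
print(distance_by_counts(S1, S2))   # 3
```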


4.2 ABSOLUTE VALUE FUNCTION


There are many special mathematical functions that will be of interest to us.
Among them is the absolute value function.


Definition 2 The absolute value of a real number x, denoted |x|, is defined by

|x| = x when x > 0

|x| = 0 when x = 0

|x| = −x when x < 0
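
The three branches of the definition translate directly into code. The following is a minimal Python sketch, with the function name `absolute_value` chosen here for illustration (Python's built-in `abs` computes the same thing).

```python
def absolute_value(x):
    """Absolute value of a real number, following the three-case definition."""
    if x > 0:
        return x
    if x == 0:
        return 0
    return -x   # here x < 0, so -x is positive

print(absolute_value(5), absolute_value(0), absolute_value(-5))   # 5 0 5
```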


For instance, |5| = 5, |0| = 0, and |−5| = −(−5) = 5.

The definition of absolute value determines a function A which assigns to
each real number x another number A(x) = |x|, its absolute value. The
graph of the absolute value function A is shown in Figure 6. Note that

FIGURE 6 Graph of the absolute value function




where C is the money value of aggregate consumption, I is the money
value of aggregate investment, and a, b, u, and v are real-valued
constants. Solve the above system of equations for Y, C, and I, thereby
representing these quantities as functions of the constants only.


13. Harrah* has a model of how a rational human receiver behaves in
certain communication situations. In his model, Harrah postulates the
existence of a semantic information function I such that if one declarative
sentence in the receiver's language L implies another, the I values of the
two sentences are connected in a specified manner. The statement "F
is L-true" is defined by Harrah to mean that F is a true statement
according to the conventions of the language system L. The statement "F
L-implies G" is defined to mean that $(F \Rightarrow G)$ is a theorem of L, and
similarly for "F and G are L-equivalent" and "F is L-false." The
statement "F is L-consistent" means that F is not L-false.
The information function I assigns real numbers to statements of L
and is characterized by the following assumptions:

(1) $0 \le I(F) \le \infty$

(2) $I(F) = 0 \Leftrightarrow$ F is L-true

(3) $I(F) = \infty \Leftrightarrow$ F is an L-false statement

(4) If F L-implies G, then $I(F) \ge I(G)$

(5) The information of F, given G, is defined by

$I(F \mid G) = I(F \wedge G) - I(G)$

From these assumptions, prove the following theorems:

(a) If F and G are L-equivalent, then $I(F) = I(G)$

(b) If F L-implies G, then $I(F \wedge G) = I(F)$

(c) $I(F \wedge G) \ge I(F) \ge I(F \vee G)$

(d) $I(F \mid G) \ge 0$

(e) If F L-implies G, then $I(G \mid F) = 0$

(f) If F L-implies G, then $I(F \mid H) \ge I(G \mid H)$

14. Let n be the function defined in Example 10. Prove that for any subsets
A and B of U, $n(A \cup B) = n(A) + n(B) - n(A \cap B)$.

15. Let D(x, y) denote the distance between two quantities x and y. The
distance function D is said to be a metric if it satisfies four axioms:

(i) $D(x, x) = 0$

(ii) $D(x, y) \ge 0$

(iii) $D(x, y) = D(y, x)$

(iv) $D(x, y) + D(y, z) \ge D(x, z)$

•Harrah, D , “A Model of q 1 




proofs. It will, of course, be understood that the given property holds for
x = 0.

Geometrically, absolute value may be interpreted as distance. If x and y
are two points on a line and x < y (see Figure 7), the distance between x
and y is y − x. In terms of absolute value, this distance may be written
either as |y − x| or as |x − y|. As long as we use absolute value, we can
subtract either way and it is not necessary to know in advance which
number is larger (an application of Property 2).

FIGURE 7 [number line marking the points 0, x, and y]

Property 3 The absolute value of a product is equal to the product of
absolute values. In symbols, |xy| = |x| |y| for all real numbers x and y.

PROOF With two numbers there are four possible cases.

Case I Let x > 0 and y > 0. In this case, xy > 0 also, so |x| = x,
|y| = y, and |xy| = xy. Hence |xy| = |x| |y|.

Case II Let x > 0 and y < 0. Here xy < 0, so |x| = x, |y| = −y, and
|xy| = −xy. Since −xy = x(−y), again we have |xy| = |x| |y|.

Case III Let x < 0 and y > 0. This is like case (II) with x and y
interchanged.

Case IV Let x < 0 and y < 0. Here xy > 0, so |x| = −x, |y| = −y,
and |xy| = xy. Since xy = (−x)(−y), then |xy| = |x| |y| in
this case also.

Since we have considered all possible cases, the proof of Property 3 is
completed. ►

Property 4 Whenever division is defined, the absolute value of a quotient
is equal to the quotient of absolute values. That is, if $y \ne 0$, then

$\left|\dfrac{x}{y}\right| = \dfrac{|x|}{|y|}$

PROOF The proof is similar to that of Property 3 and is left as an
exercise. ►

Property 5 For each real number x, $-|x| \le x \le |x|$. Equivalently, for
any x we have both

$x \le |x|$ and $-x \le |x|$

PROOF If x > 0, then −x < 0 < x = |x|. If x < 0, then x < 0 < −x = |x|. ►






|x| is never negative, no matter what the value of x may be. In fact, |x| = 0
if and only if x = 0; otherwise |x| is positive.

PROBLEMS 

1. Let x = 2 and y = −1. Then compare

(a) |xy| with |x| |y|

(b) |x/y| with |x|/|y|

(c) |x| with |−x|

(d) |x + y| with |x| + |y|

(e) |x − y| with |x| − |y|

2. Let x = −2 and y = −3 in Problem 1.

3. What general conclusions do you draw from Problems 1 and 2?

The following list contains important properties of the absolute value
function which we shall use frequently. We shall provide some of the proofs
and leave the rest to you. In each case you should compare these results
with your calculations and conclusions in Problems 1-3.

Property 1 Absolute value is never negative. That is, $|x| \ge 0$ no matter
what the value of x. Also |x| = 0 if and only if x = 0.

PROOF The definition of absolute value takes a different form depending
on whether the quantity whose absolute value is desired is positive or
negative. Thus the proof of a property must consider positive and negative
values separately. We note from the definition that when x is positive,
|x| = x, so that |x| is positive. If x is negative, then −x is positive;
moreover, |x| = −x, so again |x| is positive. The fact that |0| = 0
completes the proof. ►

Property 2 The absolute value of a number is the same as that of its
negative. In symbols, |x| = |−x| for all real numbers x.

PROOF As in the proof of Property 1, we consider three possible cases.

Case I We have x > 0. Then by definition |x| = x and
|−x| = −(−x) = x. Hence |x| = |−x|.

Case II We have x < 0. In this case |x| = −x and |−x| = −x, so
again |x| = |−x|.

Case III We have x = 0. It automatically follows that |0| = |−0|. ►

Since it is obvious that the properties of absolute value will hold when
x = 0, we shall henceforth omit consideration of case (III) from our



FIGURE 9 The interval $|x - p| \le a$, extending from p − a to p + a

Property 8 For all x, $\sqrt{x^2} = |x|$. Note that $\sqrt{x^2}$ is equal to x only when
x is positive or zero! If x < 0, then $\sqrt{x^2} = -x$.

PROOF Recall that $\sqrt{t}$ is defined for $t \ge 0$ as the non-negative number
whose square is t (for example, $\sqrt{4} = 2$, $\sqrt{9} = 3$). Hence if $x \ge 0$, $\sqrt{x^2} = x$
since x is non-negative and its square is $x^2$. Similarly, if x < 0, then
$\sqrt{x^2} = -x$ since −x is non-negative and its square is $x^2$. In this case
$\sqrt{x^2} \ne x$ since x is negative and $\sqrt{x^2}$ is, by definition, positive. But in
either case, $\sqrt{x^2} = |x|$. ►

The proofs of the four remaining properties are left as exercises.

Property 9 For all x, $|x|^2 = x^2$.

Property 10 For all x and y, $xy \le |x|\,|y|$.

Property 11 For all x and y, $|x - y| \le |x| + |y|$.

Property 12 For all x and y, $|x - y| \ge \big||x| - |y|\big|$. ►
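
A numerical spot check is no substitute for the proofs, but it can catch a misremembered inequality. The sketch below tests several of the properties on a small grid of sample values. The function name `check_properties` and the sample values are assumptions chosen for illustration (the values are exactly representable, so the equalities hold without rounding error).

```python
import math
from itertools import product

def check_properties(values):
    """Assert several absolute value properties for every pair (x, y)."""
    for x, y in product(values, repeat=2):
        assert abs(x * y) == abs(x) * abs(y)           # Property 3
        assert -abs(x) <= x <= abs(x)                  # Property 5
        assert abs(x + y) <= abs(x) + abs(y)           # Property 6
        assert math.sqrt(x * x) == abs(x)              # Property 8
        assert abs(x) ** 2 == x ** 2                   # Property 9
        assert x * y <= abs(x) * abs(y)                # Property 10
        assert abs(x - y) <= abs(x) + abs(y)           # Property 11
        assert abs(x - y) >= abs(abs(x) - abs(y))      # Property 12
    return True

print(check_properties([-3, -1.5, -1, 0, 0.5, 2, 4]))   # True
```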

PROBLEMS 

4. Verify each of the 12 properties of absolute value using x = 4 and
y = −3.

5. In each of the following cases, determine whether the relation "=" or
">" holds for the statement $|a| + |b| \ge |a + b|$ (Property 6, the
triangle inequality).

(a) a = 4, b = −2 (b) a = −4, b = −2

(c) a = 4, b = 2 (d) a = 0, b = −2

(e) a = 0, b = 0

6. Determine whether the relation "=" or ">" holds for the statement
$|a - b| \ge \big||a| - |b|\big|$ (Property 12).

(a) a = θ, b = 2θ, for θ = some constant

(b) a = −θ, b = $\sqrt{2}$, for θ > 0

(c) a = 4, b = 0

(d) a = −1, b = −2

(e) a = 3, b = −2

7. Show that the inequality $|a - b| \ge \big||a| - |b|\big|$ is equivalent to the
inequality $|ab| \ge ab$. (Hint: Square both sides.)






Property 6 The Triangle Inequality The absolute value of a sum is never
greater than the sum of the respective absolute values. Symbolically,

$|x + y| \le |x| + |y|$

for all numbers x and y.

PROOF Again there are four possible cases.

Case I Let x > 0 and y > 0. Here x + y > 0, so |x + y| = x + y =
|x| + |y|. Thus equality holds.

Case II Let x > 0 and y < 0 so that |x| = x and |y| = −y. Since x + y
could be either positive or negative, there are two subcases to
consider.

Subcase (1) Let $x + y \ge 0$. Then $|x + y| = x + y \le |x| + |y|$
by Property 5. Inequality may hold. For instance,
|3 + (−2)| = |1| = 1 < |3| + |−2| = 5.

Subcase (2) Let x + y < 0. Then |x + y| = −(x + y) and
again using Property 5, we find

$|x + y| = -(x + y) = -x - y \le |x| + |y|$

Case III Let x < 0 and y > 0. This case is similar to case (II) and is left
as an exercise. Inequality can hold in this case.

"" (< -?sV|"JSi2 s ' s ■ “• 

PROOF The proof is left as an exercise. ►

The number |x| may be interpreted as the distance between x and 0. This
distance is less than or equal to a if and only if x lies between −a and a
(see Figure 8). A similar statement holds if we measure the distance from
some point p rather than zero. The point x (Figure 9) lies within distance a
of p if and only if

$|x - p| \le a$

or, equivalently,

$p - a \le x \le p + a$

FIGURE 8 The interval $|x| \le a$, extending from −a to a






Our discussion in Example 1 shows that 1 is an upper bound for I.
Obviously, any number u > 1 is also an upper bound. But we have also argued
that no number smaller than 1 can be an upper bound. In this sense the
number 1 is the smallest or least upper bound for the set I.

Definition 3 If C is a set of real numbers and b is a number such that
(i) b is an upper bound for C
(ii) if u is any upper bound for C, then $b \le u$
then b is called the least upper bound for C and is written b = lub C. ►

A set can have only one least upper bound. For if b and d were both
least upper bounds of a set C, then since b is a least upper bound and d is an
upper bound we must have $b \le d$ (Property (ii) above). For entirely similar
reasons we also have $d \le b$, and thus b = d.

PROBLEMS 

1. For each of the following sets, what is the maximum? What is the least
upper bound?

(a) {1, 2, 3} (b) {x : 0 < x < 1}

(c) The union A = {x : 0 < x < 1} ∪ {x : 1 < x < 2}

It is a fact of great importance that every nonempty set of real numbers 
which has an upper bound has a least upper bound This seemingly obvious 
statement is, unfortunately, impossible to prove without a deeper study of 
the real numbers than we can indulge in here We shall simply accept it as 
an axiom * 

Least Upper Bound (LUB) Axiom Every nonempty set of real num- 
bers which is bounded above has a least upper bound ► 

PROBLEMS 

2. The concept of least upper bound includes the idea of a maximum.
Prove that if C has a maximum, then max C = lub C. (It is because
some bounded sets do not have maxima that we have introduced the
more general concept of least upper bound.)

3. Prove that a set can have at most one maximum.


A number b is said to be a lower bound for a set C if $b \le x$ for every $x \in C$.
The concept of lower bound is analogous to that of upper bound. If b is a
lower bound which is a member of C, then b is called the minimum of C and

*For a discussion of the principles involved see Bartle, R. G., The Elements of Real
Analysis (John Wiley & Sons, Inc., New York, 1964), Sections 4-6. More extensive
discussions may be found in the references which he cites.





8. Show that the triangle inequality

$|x| + |y| \ge |x + y|$

can be restated equivalently as

$\sqrt{x^2} + \sqrt{y^2} \ge \sqrt{(x + y)^2}$

9. Show that if xy > 0, then xy is at least as big as the smaller of $x^2$ and $y^2$.

10. Solve the inequality |x − 3| < 6. Sketch the solution set as in Figure 9.
(Hint: Use Property 7.)

11. Solve the inequality |1 − 2x| < 1. Sketch the solution set.

12. Solve the inequality |2x + 3| < 13. Sketch the solution set.

13. Prove that if y > 0, then |x| > y if and only if either x < −y or x > y.
Illustrate with a sketch.

14. Interpret Problems 10-13 in terms of distance.

15. Prove Properties 4, 7, 9, 10, 11, and 12 for absolute value. In each case
give an example where equality holds and, if possible, another example
in which inequality holds.


4.3 BOUNDED SETS OF REAL NUMBERS


In this section we shall discuss some basic concepts concerning sets of real
numbers. Some of the ideas which we want to introduce are already familiar
to you. A typical case is the concept of the maximum, or largest element, of
a set. When we say that m is the maximum of a set S, written m = max S,
we mean that

$m \ge x$ for every element $x \in S$

and

$m = y$ for some element $y \in S$

For instance, for the set S = {7, 3, 9, −6, 2}, max S = 9.


Example 1 Not every set of real numbers has a maximum. Consider, for
instance, the set $I = \{x : 0 < x < 1\}$ of all real numbers lying between zero
and one. The number 1 is not a maximum, for while it is true that 1 > x
for every $x \in I$, it is not true that 1 is an element of I. On the other hand, no
member of I can be a maximum. For if $m \in I$, then 0 < m < 1 and it
follows that m < (m + 1)/2 < 1. That is, the number (m + 1)/2 is an
element of I which is greater than m. We conclude that I has no maximum. ►


If B is a set of real numbers and u is a number such that $u \ge x$ for every
$x \in B$, then u is called an upper bound for B. A set which has an upper bound
is said to be bounded above.





Theorem 1a If r is any positive number, no matter how small, there
exists a positive integer n such that 1/n < r.

PROOF This is merely a rephrasing of Theorem 1. For if r > 0, then
1/r > 0 also. Choosing n > 1/r yields 1/n < r. ►

Theorem 1a says essentially that there is no such thing as an arbitrarily
small positive number. No matter how small a positive number we may
choose, there is a number of the form 1/n which is smaller.
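
The choice n > 1/r in the proof is easy to carry out explicitly. The following Python sketch, in which the function name `integer_with_small_reciprocal` is hypothetical, uses exact rational arithmetic to produce one such n for a given positive r.

```python
import math
from fractions import Fraction

def integer_with_small_reciprocal(r):
    """Return a positive integer n with 1/n < r, for a given r > 0."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be positive")
    # floor(1/r) + 1 is the smallest integer strictly greater than 1/r.
    return math.floor(1 / r) + 1

for r in [Fraction(1, 3), Fraction(1, 1000), 5]:
    n = integer_with_small_reciprocal(r)
    print(r, n, Fraction(1, n) < r)
# 1/3 4 True
# 1/1000 1001 True
# 5 1 True
```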

Example 3 Let $D = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\} = \{1/n : n \text{ is a positive integer}\}$.

Since $1 \in D$ and $1 \ge 1/n$ for each positive integer n, we see that 1 = max D =
lub D. Zero is a lower bound for D since 0 < 1/n for all n. But if r > 0,
Theorem 1a shows that there exists $1/n \in D$ such that 1/n < r. Hence r is
not a lower bound and glb D = 0. There is no minimum element in D. ►

PROBLEMS 

6. State precisely, using proper notation, what is meant by "b is not a lower
bound for the set A."

7. Show that a set cannot have two distinct upper bounds both of which
belong to the set.

8. Prove that a nonempty set has at most one glb.

9. Given the set $A = \{\tfrac{2}{3}, \tfrac{4}{5}, \tfrac{6}{7}, \tfrac{8}{9}, \ldots, 2n/(2n + 1), \ldots\}$

(a) Find the glb and lub of this set.

(b) If c = lub A, find a member of the set which is greater than c −

10. Let A be a nonempty set and define the set B by $B = \{x : -x \in A\}$.
Prove that c = lub A if and only if −c = glb B.

11. Prove that if B is a bounded set of real numbers and if A is a nonempty
subset of B, then

$\text{glb } A \ge \text{glb } B$ and $\text{lub } A \le \text{lub } B$

12. Prove that if $b \in H$ and b is an upper bound for H, then b = lub H.


SUPPLEMENTARY READING 

Anderson, K W , and Hall, D W , Sets, Sequences and Mappings (John Wiley & 
Sons, Inc , New York, 1963), Chapter 1 

Good, R A , Introduction to Mathematics (Harcourt, Brace & World, Inc , New 
York, 1966), Chapters 9 and 10 





we write b = min C. The number b is the greatest lower bound of C, b =
glb C, if

(i) b is a lower bound for C
(ii) if d is any lower bound for C, then $b \ge d$

A set is bounded below if it has a lower bound. A set which has both an upper
bound and a lower bound is said to be bounded.

Example 2 Zero is the greatest lower bound of the set

$I = \{x : 0 < x < 1\}$

Firstly, it is obvious from the definition of I that 0 < x for every $x \in I$.
Secondly, no lower bound can be greater than zero. For if b > 0, then
0 < b/2 < b. That is, $b/2 \in I$ and b/2 < b. Hence b is not a lower bound.
I has no minimum. However, it is bounded since it has an upper bound, 1,
and a lower bound, 0. ►

PROBLEMS 

4. Rework Problems 2 and 3 with the word maximum replaced by minimum
and least upper bound replaced by greatest lower bound.

One of the most important results implied by the LUB Axiom is known
as the Archimedean property of the real numbers.

Theorem 1 The Archimedean Property Let r be any positive real
number. Then there exists a positive integer n such that n > r.

PROOF An indirect proof is best. Suppose that the conclusion is false.
That is, suppose that for every positive integer n we have $r \ge n$. Then the
number r is an upper bound for the set N = {1, 2, 3, 4, ...} of positive
integers. It follows from the LUB Axiom that N has a least upper bound b.
Now if n is a positive integer, then so is n + 1. Hence $b \ge n + 1$ for all
$n \in N$ or, equivalently, $b - 1 \ge n$ for all n. But this means that b − 1 is
an upper bound for N, an impossibility since b − 1 < b and b is the least
upper bound.

We arrived at this contradiction by assuming that N was bounded. This
supposition must be false and hence the theorem is true. ►

PROBLEMS 

5. Since N, the set of positive integers, is unbounded, it follows that any
set (such as the set R of real numbers) of which N is a subset must also
be unbounded. Prove this.

An alternative version of the Archimedean property which we shall find 
useful is the following theorem 




The remaining distance is now one unit, and stepping half this distance
brings us to position $P_2 = \tfrac{3}{2}$. Continuing in this fashion brings us
successively to $P_3 = \tfrac{7}{4}$, $P_4 = \tfrac{15}{8}$, $P_5 = \tfrac{31}{16}$, $P_6 = \tfrac{63}{32}$. In this way we obtain
the sequence

$(P_n) = (P_1, P_2, P_3, \ldots) = (1, \tfrac{3}{2}, \tfrac{7}{4}, \tfrac{15}{8}, \ldots)$

It is easy to verify that in general the nth position will be

$P_n = 2 - \dfrac{1}{2^{n-1}}$ for n = 1, 2, 3, ... (1)

For instance, $P_3 = \tfrac{7}{4} = 2 - (1/2^2)$ and $P_5 = \tfrac{31}{16} = 2 - (1/2^4)$. ►
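
The stepping process and formula (1) can be compared directly. The sketch below is a minimal Python illustration, assuming a line of length 2 with the start at zero; the function name `positions` is chosen here and is not from the text.

```python
from fractions import Fraction

def positions(steps, length=2):
    """Positions reached by repeatedly stepping half the remaining distance."""
    position = Fraction(0)
    reached = []
    for _ in range(steps):
        position += (length - position) / 2
        reached.append(position)
    return reached

stepped = positions(6)
formula = [2 - Fraction(1, 2 ** (n - 1)) for n in range(1, 7)]
print([str(p) for p in stepped])   # ['1', '3/2', '7/4', '15/8', '31/16', '63/32']
print(stepped == formula)          # True
```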

One common way of indicating a sequence is to list the first few values and
to use dots to indicate that the sequence continues according to the pattern
thereby established. For instance, the set E of positive even integers might
be indicated by E = (2, 4, 6, 8, ...). This notation has the slight
disadvantage that one can never be absolutely certain what the pattern is. The
notation (2, 4, 6, 8, ...) also corresponds, for instance, to the first four
values of the sequence given by the formula

$S(n) = 2n + (n - 1)(n - 2)(n - 3)(n - 4)$ (2)

whose next value is S(5) = 10 + (5 − 1)(5 − 2)(5 − 3)(5 − 4) = 34,
rather than 10. However, as a practical matter, the use of a notation like
(2, 4, 6, 8, ...) rarely, if ever, causes confusion. Everyone "knows" that
the sequence of even integers is intended. Hence we will use such notation as
a matter of convenience, writing a defining relation such as (2) in less obvious
cases.
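
Formula (2) is easy to evaluate mechanically, which makes the ambiguity of the dot notation concrete. A short Python sketch (the function name `S` follows the formula; `first_five` is introduced here for illustration):

```python
def S(n):
    """Formula (2): agrees with 2n for n = 1, 2, 3, 4 but not for n = 5."""
    return 2 * n + (n - 1) * (n - 2) * (n - 3) * (n - 4)

first_five = [S(n) for n in range(1, 6)]
print(first_five)   # [2, 4, 6, 8, 34]
```

The product term vanishes for n = 1, 2, 3, 4, which is why the first four values coincide with those of the even integers.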

Specifying the values of a sequence by a formula, as in (1) and (2), allows
no ambiguity as to what the sequence is. The same clarity is obtained if the
sequence is specified by a recursive formula. That is, we may specify the first
value $a_1$ and give a rule for finding $a_n$ $(n \ge 2)$ once $a_{n-1}$ is known. Or, more
generally, we may specify $a_1$ and give a rule for obtaining $a_n$ from $a_1, a_2,
\ldots, a_{n-1}$. For example, the sequence $E = (a_1, a_2, a_3, \ldots)$ of even integers
is given recursively either by

$a_1 = 2$ and $a_n = a_{n-1} + 2$, $n \ge 2$

or by

$a_1 = 2$ and $a_n = a_{n-1} + a_1$, $n \ge 2$

Of course, the simplest and most natural way to specify E is to write $a_n = 2n$
for each integer $n \in N$.
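
The recursive and the explicit specifications of E can be placed side by side in code. This is a small Python sketch with hypothetical function names; both functions return the first `count` terms.

```python
def even_recursive(count):
    """First `count` terms from a_1 = 2 and a_n = a_(n-1) + 2."""
    terms = []
    for _ in range(count):
        terms.append(2 if not terms else terms[-1] + 2)
    return terms

def even_closed_form(count):
    """First `count` terms from the explicit formula a_n = 2n."""
    return [2 * n for n in range(1, count + 1)]

print(even_recursive(5))                           # [2, 4, 6, 8, 10]
print(even_recursive(50) == even_closed_form(50))  # True
```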

Example 2 When money is borrowed at simple interest, the amount of
interest which accrues is the same for each time unit of given length (say a
year), and is determined by multiplying the per time unit interest rate i by




SEQUENCES H 


5.1 BASIC CONCEPTS

A function whose domain is the set N = {1, 2, 3, ...} of positive integers
is called a sequence. Since a function consists of ordered pairs, it follows
that a sequence must be a set like

$\{(1, a_1), (2, a_2), (3, a_3), (4, a_4), \ldots\}$

Since the domain of a sequence is always the set N, a sequence is usually
written more briefly as

$S = (a_1, a_2, a_3, \ldots)$ or simply $S = (a_n)$

The notation ( ), rather than { }, is used to denote a sequence in order to
emphasize that we are concerned not only with the set of values of the
sequence but also with the order in which they appear.

Example 1 Suppose we start at one end of a line that is two units long
and move toward the other end in a succession of steps, each of which covers
half the remaining distance. From our starting position at zero, shown in
Figure 1, we step half the total distance to arrive at the first new position
$P_1 = 1$.

FIGURE 1





bounded below if it has a lower bound, and bounded if it is bounded above
and bounded below.

The sequence $(1, \tfrac{3}{2}, \tfrac{7}{4}, \ldots)$ of Example 1 is bounded. In fact, 0 is a lower
bound and 2 an upper bound for this sequence. Similarly $H = (1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots)$
is bounded below by 0 and above by 1 and so H is bounded. The sequence
E = (2, 4, 6, ...) of even integers is bounded below by zero but is not
bounded above. The sequence (1, −1, 2, −2, 3, −3, ...) is not bounded
either above or below.

PROBLEMS 

4. For the sequence $(a_n)$ in Problem 3

(a) is 2 an upper bound?

(b) is 0 a lower bound?

(c) is | an upper bound 9 


If $A = (a_n)$ is a sequence and if $r_1 < r_2 < r_3 < \cdots < r_n < \cdots$ is a
strictly increasing sequence of positive integers, then the sequence

$B = (a_{r_1}, a_{r_2}, a_{r_3}, \ldots)$

is called a subsequence of A. Thus, for example, the sequences E =
(2, 4, 6, 8, ...) and (1, 3, 5, 7, ...) are both subsequences of the
sequence N = (1, 2, 3, 4, ...) of positive integers. One possible
subsequence of the sequence (1, −1, 1, −1, 1, −1, ...) is the constant sequence
(1, 1, 1, 1, ...) obtained by choosing every other term beginning with the
first. The sequence $(a_3, a_4, a_5, a_6, \ldots)$ is a subsequence of $A =
(a_1, a_2, a_3, a_4, \ldots)$. In fact any sequence obtained from A by choosing
elements $a_n$ so that the subscripts are in ascending order is a subsequence
of A.
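
Selecting a subsequence amounts to evaluating the sequence at a strictly increasing list of subscripts. The Python sketch below assumes the sequence is given by a rule `term(n)`; the function name `subsequence` and the sample rules are hypothetical, and only finitely many terms are produced.

```python
def subsequence(term, indices):
    """Terms a_(r_1), a_(r_2), ... for strictly increasing positive subscripts."""
    indices = list(indices)
    if indices and indices[0] < 1:
        raise ValueError("subscripts must be positive integers")
    if any(later <= earlier for earlier, later in zip(indices, indices[1:])):
        raise ValueError("subscripts must be strictly increasing")
    return [term(r) for r in indices]

def alternating(n):
    """The sequence (1, -1, 1, -1, ...)."""
    return (-1) ** (n + 1)

print(subsequence(lambda n: n, [2, 4, 6, 8]))   # [2, 4, 6, 8]
print(subsequence(alternating, [1, 3, 5, 7]))   # [1, 1, 1, 1]
```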

problems 

5 One subsequence of H — (1, ) is the sequence A = 

y, ) Write another subsequence of H. Write a subsequence of A. Is
every subsequence of A also a subsequence of H?


If $A = (a_n)$ and $B = (b_n)$ are two sequences, then the sum S = A + B
of A and B is the sequence whose elements are, respectively, the sums of the
corresponding elements of A and B. That is, $S = (s_n)$ is the sum of the
sequences A and B if for each n, $s_n = a_n + b_n$.

Similarly, the difference D = A − B is the sequence $D = (d_n)$ defined by
$d_n = a_n - b_n$. The product P = A · B is the sequence $P = (p_n)$ defined by
$p_n = a_n b_n$. If no element of B is zero, the quotient of A and B is the sequence
$Q = (q_n) = A/B$ defined by

$q_n = a_n / b_n$




the original amount borrowed P. Hence if $A_n$ is the total amount to be
repaid after n time periods, the sequence $(A_1, A_2, A_3, \ldots)$ is determined
by the recursive formula

$A_1 = P + iP$

and, for $n \ge 2$,

$A_n = A_{n-1} + iP$

The first four terms of the sequence are

$A_1 = P + iP$
$A_2 = A_1 + iP = P + 2iP$
$A_3 = A_2 + iP = P + 3iP$
$A_4 = A_3 + iP = P + 4iP$

It is apparent that this pattern will continue and that for any $n \in N$,

$A_n = P + niP = P(1 + in)$
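
The agreement between the recursive formula and the closed form can be checked for particular numbers. In the Python sketch below, the principal of 200 and the rate of 3% are invented purely for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def amounts_recursive(P, i, periods):
    """A_1 = P + iP, then A_n = A_(n-1) + iP for n >= 2."""
    amounts = [P + i * P]
    for _ in range(2, periods + 1):
        amounts.append(amounts[-1] + i * P)
    return amounts

def amount_closed_form(P, i, n):
    """A_n = P(1 + in)."""
    return P * (1 + i * n)

P, i = 200, Fraction(3, 100)   # hypothetical principal and rate per period
recursive = amounts_recursive(P, i, 4)
print([str(a) for a in recursive])                                       # ['206', '212', '218', '224']
print(recursive == [amount_closed_form(P, i, n) for n in range(1, 5)])   # True
```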


PROBLEMS

1. If $100 is borrowed at 5% simple interest for three years, what is the
final amount due?

A sequence $(a_n)$ is said to be increasing if $a_n \ge a_{n-1}$ for each integer
$n \ge 2$. The sequences in Examples 1 and 2, the sequence E of even integers,
and the sequence (0, 0, 1, 1, 2, 2, 3, 3, ...) are all increasing. A sequence $(a_n)$
is decreasing if $a_n \le a_{n-1}$ for each integer $n \ge 2$. An example is the
harmonic sequence

$H = (1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots)$

The constant sequence (3, 3, 3, 3, ...) is both increasing and decreasing.

PROBLEMS 

2. Give two examples of increasing sequences and two of decreasing
sequences.

3. Is the sequence $(a_n)$ defined by

ai = 1 

a „ = £(2 - n > 2 

increasing? decreasing?

A real number u is called an upper bound for the sequence $(a_n)$ if $a_n \le u$
for every n. Similarly, the number c is a lower bound for $(a_n)$ if $c \le a_n$ for
every n. We say that a sequence is bounded above if it has an upper bound,




(c) If the trainer were to continue the training trials indefinitely, how
many dogs would he expect never to be conditioned?

10. The Mugivit Company wishes to increase its sales revenue by 3% per
quarter. If S_t is the sales revenue in quarter t and S_(t−1) is the sales
revenue in the immediately preceding quarter, then the quarterly growth rate
G (in percent) of the sales revenue is given by



Given that S_1 = $100,000, calculate the required sales revenue needed to
meet the growth requirements in the (a) second, (b) third, (c) fourth,
(d) nth quarters. Is the sequence bounded below (above)?

11. Prove that every subsequence of an increasing sequence is increasing.

12. Prove in general that every increasing sequence has a lower bound.

13. Prove that an increasing sequence is bounded if it has an upper bound.

14. Prove that a sequence (p_n) is bounded if there exists a non-negative
number M such that |p_n| ≤ M for all n ∈ N.

15. Show that if (p_n) and (q_n) are bounded sequences, then the sum
(s_n) = (p_n) + (q_n) is also a bounded sequence.


5.2 SEQUENCES OF STATEMENTS: MATHEMATICAL INDUCTION

The forms of valid argument introduced in Chapter 2 all involve only a
finite number of statements, usually two or three. In this section we present
a form of argument, known as mathematical induction, which enables us to
establish the truth of an infinite sequence of statements. Our procedure is
based on the following property of the positive integers.

Well Ordering Principle Let S be a nonempty set of positive integers.
Then S has a smallest element. ►

For our purposes we shall accept this statement as an axiom, albeit a
perfectly natural one. Of course every nonempty set of positive integers
contains a minimum! How could it be otherwise? Notice, however, that this
innocent-looking statement would be false if the set N of positive integers
were replaced by the set Q of rational numbers (fractions) or the set R of
reals. In Section 4.3 we saw numerous examples of sets which, even though
bounded below, did not have a minimum. An example is the set of all real
numbers which are larger than 2.

Example 1 The union $\bigcup_{n=1}^{\infty} A_n$ consists of those elements which are
contained in at least one of the sets of the sequence $(A_1, A_2, A_3, \ldots)$. Suppose




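Example 3 If A and B are the sequences A = (1, 3, 5, 7, 9, 11, ...) and
$B = (1, -\tfrac{1}{2}, \tfrac{1}{3}, -\tfrac{1}{4}, \tfrac{1}{5}, -\tfrac{1}{6}, \ldots)$, then

$A + B = (2, \tfrac{5}{2}, \tfrac{16}{3}, \tfrac{27}{4}, \tfrac{46}{5}, \tfrac{65}{6}, \ldots)$

$A - B = (0, \tfrac{7}{2}, \tfrac{14}{3}, \tfrac{29}{4}, \tfrac{44}{5}, \tfrac{67}{6}, \ldots)$

$A \cdot B = (1, -\tfrac{3}{2}, \tfrac{5}{3}, -\tfrac{7}{4}, \tfrac{9}{5}, -\tfrac{11}{6}, \ldots)$

$A/B = (1, -6, 15, -28, 45, -66, \ldots)$

Similarly, if C = (1, 0, −1, 1, 0, −1, 1, 0, −1, ...), we have defined A + C,
A − C, and A · C, but A/C is not defined since some of the elements of C
are zero.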

PROBLEMS 

6. Write out the first four terms of the sequences for which the nth term is 

' ' " - - 1 


(a) 


„ 2 + 2 (bH-i)^ — , ) " ( „ + 2) (c)l+. 


(d) 


2 "+ 1 


(e) n I 2 - In 


7 - wme — and 

8 ' L^afoTo'whfch'Jdlor'oT 5 “I *7“““ <W are given Write a 
" h ' Ch Wl “ produce ,he values given for n - 1 , 2 , 3 , and 4 
,v 1 +9 4 4- Q Q^n „ 


(a)i±i, i±i, L+? 16 + 9 

32 4 2 52 ’ 

, __ 2^ 2* 

’ 5 2 ’ 5 3 * ’ 51 

(c) -1 + V3, 0 + VJ, t + V5, 2 + V6 

(d) 4-V5, S _Vg, 12 _VT3,,6-V20 


2 ? 2 

{b) r 


(e) li_? , 8 + 2 ^ 12 + 2 


I 

... 50 dogs to stand on their heads at the sound of a buzzer. Being of a
theoretical bent, he has hypothesized that the average number of dogs who
are conditioned at the completion of the $k$th trial is

$K(k) = 50 - 50(1 - \theta)^k$

where $\theta$ is a learning parameter, $0 < \theta < 1$.

(a) Given $\theta = 0.5$, how many dogs on the average would the trainer
expect to be conditioned to the buzzer at the completion of the (i) first,
(ii) second, (iii) third, (iv) $n$th trial?

(b) Is the sequence bounded above (below)? If so, indicate an upper
(lower) bound.



Theorem 1 Let P be a set of positive integers such that 

(a) P contains the integer 1 . 

(b) if n is any member of P, then n + 1 is also a member of P. 

Then P is the set of all positive integers. 

PROOF An indirect proof is best. Thus, suppose that conditions (a) and 
(b) hold, but that P does not contain all positive integers. Then the set V of 
positive integers not in P is nonempty, and by the Well Ordering Principle 
has a minimum $m$. Note that $m > 1$ since 1 is in $P$. Since $m$ is the smallest
element of $V$, $m - 1$ cannot be in $V$, but must be in $P$. If so, it follows from
(b) (taking $n = m - 1$) that $(m - 1) + 1 = m$ is also in $P$, contradicting
the fact that $m \in V$. ►

The above discussion serves as a preliminary to the following statement, 
the main result of this section, which represents one of the most important 
of mathematical tools. 

Principle of Mathematical Induction Let $S_1, S_2, S_3, \ldots$ be a
sequence of statements for which

(a) $S_1$ is true.

(b) for every $n$, if $S_n$ is true, then so is $S_{n+1}$.

Then all the statements $S_1, S_2, S_3, \ldots$ are true.

PROOF The proof follows immediately since the set $T$, containing those
integers $n$ for which the corresponding statement $S_n$ is true, satisfies
conditions (a) and (b) of Theorem 1.

Intuitively, the induction principle is like the game with dominos (illus- 
trated in Figure 3) that we all played as children. We first set the dominos 
on end so that if one fell it would hit and knock over the next one (Property 
(b) above). With this setup, the only action required to topple all the dominos 
is to knock over the first one (Property (a)). Obviously both properties, (a) 
and (b), are necessary to ensure that all dominos fall. If either is omitted, we
have no guarantee whatever concerning the outcome. 



larger disk is never placed on top of a smaller one.


FIGURE 4

Obviously, with one disk only one move is required. With two disks we
first shift the smaller one to one of the empty pegs, then move the larger one
to the remaining empty peg, and finally place the smaller one on top of the
larger one, a total of three moves. A little experimenting shows that three
disks require 7 moves, four require 15 moves, five require 31 moves, and so
forth. Table 1, in which $M_n$ denotes the number of moves required to shift
$n$ disks, summarizes these results.



TABLE 1

$n$       1     2     3     4     5
$M_n$     1     3     7     15    31


In each case (or for each $n$) the value of $M_n$ is given by the formula

$M_n = 2^n - 1$

and it is natural to wonder whether this holds for all positive integers $n$.
To check this conjecture, we introduce the sequence of statements
$(S_1, S_2, S_3, \ldots)$ where $S_n$ is the statement "$M_n = 2^n - 1$."

We know already that $S_1$ is true, as well as $S_2$, $S_3$, $S_4$, and $S_5$. Now
suppose that for some value of $n$, $S_n$ is true. That is, suppose that the number
of moves needed to shift $n$ disks really is $2^n - 1$. Then the moves required to
shift $n + 1$ disks may be calculated as follows:

(i) Move the top $n$ disks to another peg. This requires (by hypothesis)
$2^n - 1$ moves.

(ii) Move the last disk to the remaining empty peg. This requires one
move.

(iii) Move the $n$ disks on top of the last. This requires $2^n - 1$ moves.

The total number of moves required is

$(2^n - 1) + 1 + (2^n - 1) = 2 \cdot 2^n - 1 = 2^{n+1} - 1$




Thus, we have shown that if $M_n = 2^n - 1$, then $M_{n+1} = 2^{n+1} - 1$. That
is, if statement $S_n$ is true, then so is $S_{n+1}$. By the Principle of Mathematical
Induction, then, all the statements $S_1, S_2, S_3, \ldots$ are true.
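
The three-step argument used in the inductive step can be traced in code. The following Python sketch is only an illustration: the function name `hanoi_moves` and the peg labels "A", "B", "C" are our own assumptions, not part of the text. It generates the moves by the same recursion (shift the smaller stack, move the largest disk, shift the smaller stack back) and checks that the count agrees with $2^n - 1$ for a few small values of $n$.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that shifts n disks
    from source to target, never placing a larger disk on a smaller one."""
    if n == 0:
        return []
    # Step (i): move the top n-1 disks onto the spare peg.
    first = hanoi_moves(n - 1, source, spare, target)
    # Step (ii): move the largest disk to the target peg (one move).
    middle = [(source, target)]
    # Step (iii): move the n-1 disks from the spare peg onto the largest disk.
    last = hanoi_moves(n - 1, spare, target, source)
    return first + middle + last


# Compare the number of generated moves with the formula 2^n - 1.
for n in range(1, 8):
    assert len(hanoi_moves(n)) == 2**n - 1
```

A finite check of this kind only confirms the formula for the values tried; it is the induction argument that covers every positive integer $n$.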


PROBLEMS

1. Suppose we impose the restriction that a disk can be moved from one
peg only to an adjacent peg. Prove that to move $n$ disks from one end peg
to the other now requires $3^n - 1$ moves.

Example 3 Devletoglou* used a variant of the so-called cobweb model
as a means of studying the effect of public prediction on economic stability.
We shall consider only the basic model which assumes

(i) $D_t = a + bP_t$
(ii) $S_t = c + dP_{t-1}$
(iii) $D_t = S_t$

where $D_t$ represents the quantity demanded, $S_t$ the quantity supplied, and
$P_t$ the price during the $t$th time period, and $a$, $b$, $c$, and $d$ are constants with
$b < 0$ and $d > 0$. In effect, the model postulates linear demand and supply
curves, with demand at time $t$ being a function of the price at time $t$, and
supply at time $t$ being a function of the price in the immediately preceding
time period.

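From (i) and (iii) we find

$P_t = \frac{S_t - a}{b}$

which, when substituted into (ii), gives

$S_t = c + d\left[\frac{S_{t-1} - a}{b}\right] = \frac{d}{b} S_{t-1} + \frac{bc - ad}{b}$

Writing $\alpha = d/b$ and $\beta = (bc - ad)/b$ gives the recursive formula

$S_t = \alpha S_{t-1} + \beta$    (3)

for the sequence $(S_1, S_2, S_3, \ldots)$.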

Substituting a few values for $t$ in (3) gives

$S_2 = \alpha S_1 + \beta$

$S_3 = \alpha S_2 + \beta = \alpha[\alpha S_1 + \beta] + \beta = \alpha^2 S_1 + \alpha\beta + \beta$

$S_4 = \alpha S_3 + \beta = \alpha[\alpha^2 S_1 + \alpha\beta + \beta] + \beta = \alpha^3 S_1 + \alpha^2\beta + \alpha\beta + \beta$

$S_5 = \alpha S_4 + \beta = \alpha[\alpha^3 S_1 + \alpha^2\beta + \alpha\beta + \beta] + \beta$
$= \alpha^4 S_1 + \alpha^3\beta + \alpha^2\beta + \alpha\beta + \beta$

* Devletoglou, E. A., "Correct Public Prediction and the Stability of Equilibrium," Journal
of Political Economy 69, 142-161 (1961).



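It seems reasonable to guess that, in general,

$S_t = \alpha^{t-1} S_1 + \alpha^{t-2}\beta + \cdots + \alpha\beta + \beta$

or, making use of Problem 11,

$S_t = \alpha^{t-1} S_1 + \frac{\beta[1 - \alpha^{t-1}]}{1 - \alpha}$    (4)

In order to check on this guess, let $V_t$ denote the statement (4). For $t = 1$,
(4) becomes

$S_1 = \alpha^0 S_1 + \frac{\beta[1 - \alpha^0]}{1 - \alpha} = S_1$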


which is certainly true. Now suppose that for some value of $t$, $V_t$ (that is,
statement (4)) is actually true. Then

$S_{t+1} = \alpha S_t + \beta$    [by (3)]

$= \alpha\left\{\alpha^{t-1} S_1 + \frac{\beta[1 - \alpha^{t-1}]}{1 - \alpha}\right\} + \beta$    [inductive hypothesis (4)]

A little manipulation simplifies this to

$S_{t+1} = \alpha^t S_1 + \frac{\beta(1 - \alpha^t)}{1 - \alpha}$

which is statement $V_{t+1}$.

To summarize, we have shown for the sequence of statements

$(V_1, V_2, V_3, \ldots)$

that

(a) $V_1$ is true.

(b) if $V_t$ is true, then so is $V_{t+1}$.

The Principle of Mathematical Induction allows us to conclude that every
statement $V_t$ is true and that formula (4) is correct for $t = 1, 2, 3, 4, \ldots$ ►
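
As a numerical companion to Example 3, the recursion (3) and the closed form (4) can be compared directly. The Python sketch below is an illustration under stated assumptions: the function names and the particular constants $a = 100$, $b = -2$, $c = 10$, $d = 1$, and $S_1 = 50$ are invented for the demonstration (chosen only so that $b < 0$, $d > 0$, and $\alpha \ne 1$) and do not come from the text.

```python
import math


def supply_by_recursion(s1, alpha, beta, t):
    """Apply S_t = alpha * S_{t-1} + beta repeatedly, starting from S_1."""
    s = s1
    for _ in range(t - 1):
        s = alpha * s + beta
    return s


def supply_closed_form(s1, alpha, beta, t):
    """Formula (4): alpha^(t-1) * S_1 + beta * (1 - alpha^(t-1)) / (1 - alpha).
    Requires alpha != 1."""
    return alpha ** (t - 1) * s1 + beta * (1 - alpha ** (t - 1)) / (1 - alpha)


# Hypothetical model constants (assumptions for illustration only).
a, b, c, d = 100.0, -2.0, 10.0, 1.0
alpha = d / b                  # alpha = d/b
beta = (b * c - a * d) / b     # beta = (bc - ad)/b
s1 = 50.0

for t in range(1, 21):
    recursive = supply_by_recursion(s1, alpha, beta, t)
    closed = supply_closed_form(s1, alpha, beta, t)
    assert math.isclose(recursive, closed, rel_tol=1e-9)
```

Agreement for twenty values of $t$ is evidence, not proof; the inductive argument above is what establishes formula (4) for every $t$.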


It should be emphasized that mathematical induction is not an inductive
process at all, but rather a deductive method used primarily to establish
known (or suspected) formulas. Examples 2 and 3 are typical instances.
Induction enters a problem when one has observed particular values for a few
cases and, from these, attempts to infer a general formula.

PROBLEMS

2. In Example 2 of Section 5.1, prove by mathematical induction that
$A_n = P(1 + ni)$ for $n = 1, 2, 3, \ldots$

3. Prove that the compound amount $A_n$ of a principal $P$ invested at $i$%
interest per year is given by $A_n = P(1 + i)^n$.



4. Population growth is frequently analyzed* by making use of the
relationship

(population size at beginning of $(t + 1)$st time period)
    = (population size at beginning of $t$th time period)
    + (no. of births in $t$th time period)
    - (no. of deaths in $t$th time period)
    + (no. of in-migrants in $t$th time period)
    - (no. of out-migrants in $t$th time period)

Suppose that we make the following assumptions:

(1) Effects of out-migration and in-migration are negligible and can
therefore be ignored.

(2) Birth rate $\alpha$ is constant over time.

(3) Death rate $\beta$ is constant over time.

(a) Prove by mathematical induction that given these assumptions the
model for population size $P_t$ at the beginning of the $(t + 1)$st time
period is

$P_t = P_0(1 + \alpha - \beta)^t$ for $t = 1, 2, \ldots$

(b) If $\alpha = 0.03$ and $\beta = 0.05$, approximately how many time periods
will be required to reduce the population to 25% of its original size?

5. A model of learning proposed by Bush and Mosteller† assumes that on
the $n$th trial of a sequence of trials, the probability $p_n$ of obtaining a
certain response is related to the probability on the immediately preceding
trial by

$p_n = a + (1 - a - b)p_{n-1}$

Here $a$ and $b$ are constants which depend on the respective per trial
amounts of reward and inhibition. Prove that for $n = 1, 2, 3, \ldots$

$p_n = p_0(1 - a - b)^n + \frac{a}{a + b}\left[1 - (1 - a - b)^n\right]$

where $p_0$ is the initial probability of response. (Note that no knowledge
of probability theory is required to solve this problem.)

In Problems 6-17, prove by mathematical induction that the given
statements are valid for each positive integer $n$.

6. $1 + 2 + 3 + \cdots + n = n(n + 1)/2$ (Hint: Let $f$ be the function
defined for positive integers by $f(n) = 1 + 2 + 3 + \cdots + n$ and let
$S_n$ be the statement $f(n) = n(n + 1)/2$. Use the fact that
$f(n + 1) = f(n) + (n + 1)$ and proceed as in the examples.)

* Matras, J., "Demographic Trends in Urban ...," Hauser, P. M., ed. (Unesco, 1964).

† Bush, R. R., and Mosteller, F., "A Mathematical Model for Simple Learning,"
Psychological Review 58, 313-323 (1951).



7. $1^2 + 2^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$

8. $1 + 3 + 3^2 + \cdots + 3^{n-1} = \frac{1}{2}(3^n - 1)$

9. $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n + 1) = \frac{1}{3}n(n + 1)(n + 2)$

10. $2 + 2^2 + 2^3 + \cdots + 2^n = 2^{n+1} - 2$

11. $a + ar + ar^2 + \cdots + ar^{n-1} = a(1 - r^n)/(1 - r)$ for $r \ne 1$

12. $(1 + a)^n \ge 1 + na$ for $a \ge -1$

13. $n < 2^n$ (Hint: Apply Problem 12.)

14. If $c > 1$, then $c^{n+1} > c^n \ge c$

15. If $0 < c < 1$, then $0 < c^{n+1} < c^n \le c$

16. If $a$ and $b$ are positive, then $a^n < b^n$ if and only if $a < b$

17. Let $n!$ denote the product $n(n - 1)(n - 2) \cdots (2)(1)$ of the integers 1
through $n$. Then $n! \ge 2^{n-1}$

18. Apply the result of Problem 11 to Problems 8 and 10 and to Formula (4)
in Example 3.

A proof using mathematical induction will fail if either $S_1$ is not true or if
$S_n$ does not imply $S_{n+1}$ for all $n$. The next two problems illustrate these
possibilities.

19. Define the function $f$ by $f(n) = 1 + 2 + \cdots + n$ and let $S_n$ be the
statement

$f(n) = \frac{n^2 + n + 7}{2}$

(a) Prove that for all $n$, if $S_n$ is true, then $S_{n+1}$ is true. (Hint: Argue
that $f(n + 1) = f(n) + (n + 1)$.)

(b) Argue that the result of Problem 6 means that all $S_n$ are false.

20. Referring to Example 2, let $S_n$ be the statement $M_n = n^2 - n + 1$.
Prove that $S_1$, $S_2$, and $S_3$ are true, but that $S_n$ implies $S_{n+1}$ only if
$n = 1$ or $n = 2$. (Hint: Argue that $M_{n+1} = 2M_n + 1$.)


5.3 CONVERGENT SEQUENCES OF REAL NUMBERS

It is apparent that successive members of the sequence $(P_n)$ in Example 1,
Section 5.1, get closer and closer to the point 2. They also get closer to 3,
to 4, and so forth, yet somehow we think of 2 as being special in this regard.
Why is this so? The $n$th member of the sequence is

$P_n = 2 - \frac{1}{2^{n-1}}$



Suppose we choose a number $r > 0$ and mark the point $2 - r$ on a line as
shown in Figure 5. We know from the Archimedean property (and Problem 13
of Section 5.2) that no matter how small $r$ may be, there is an integer
$k$ such that $1/2^{k-1}$ is smaller than $r$. Hence for this value of $k$, we must have
$2 - r < P_k < 2$ as shown in Figure 6. Since the sequence is increasing and
bounded above by 2, it follows that all the remaining members of the
sequence, that is, $P_{k+1}, P_{k+2}, P_{k+3}, \ldots$, also lie between $2 - r$ and 2
as shown in Figure 7. In words we can say that the distance between the
fixed number 2 and a member $P_n$ of the sequence can be made as small as we
please (less than $r$) by choosing a member far enough along in the sequence
(choosing $n$ greater than $k$). The number 2 is the only real number for
which this statement can be made and it is this property which gives 2 its
special significance. In the sense of the following definition, 2 is the limit of
the sequence $(P_n)$.

FIGURES 5-7 (number lines marking $2 - r$, $P_k$, $P_{k+1}$, $P_{k+2}$, and 2)

Definition 1 Let $S = (s_n)$ be a sequence of real numbers. A number $L$ is
called the limit of the sequence $S$ if for each number $r > 0$ there is an
integer $k$ such that the distance between $s_n$ and $L$ is less than $r$ for all integers
$n$ exceeding $k$. Symbolically,

$|s_n - L| < r$ when $n > k$

If $L$ is the limit of $S$, we say that $S$ converges to $L$ and we write

$L = \lim S$ or $L = \lim_{n \to \infty} (s_n)$

If $S$ has no limit (that is, does not converge) it is said to diverge. ►

A geometric interpretation of the limit of a sequence is given in Figure 8.
The point $L$ is the limit of the sequence $(s_n)$ if, no matter how small the
interval lying between the points $L - r$ and $L + r$, a point can be found
in the sequence beyond which all members of the sequence lie inside the
interval.

FIGURE 8 (number line marking $L - r$, $L$, and $L + r$)

Example 1 (a) If $c$ is a real number, the constant sequence determined
by $c$ is defined by

$S = (s_n) = (c, c, c, c, \ldots)$

This sequence converges to $c$ since, if $r$ is positive, we have

$|s_n - c| = |c - c| = 0 < r$ for all $n \ge 1$

(b) Let $H$ denote the harmonic sequence $(h_n) = (1/n) = (1, \frac{1}{2}, \frac{1}{3}, \ldots)$
illustrated in Figure 9. The Archimedean property of the real numbers
guarantees that for any $r > 0$, there exists an integer $k$ such that $1/k < r$.
Then if $n > k$ we have

$0 < h_n = \frac{1}{n} < \frac{1}{k} < r$

which means that the sequence $H$ has limit zero.

FIGURE 9

►
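
The roles of $r$ and $k$ in Definition 1 can be made concrete for the harmonic sequence. The Python sketch below is only illustrative: the helper name `index_for_tolerance` is our own assumption, and exact fractions are used so that no rounding enters. For a given tolerance $r$ it produces an integer $k$ with $1/k < r$ and then spot-checks that the terms following $k$ lie within $r$ of the limit 0.

```python
import math
from fractions import Fraction


def index_for_tolerance(r):
    """Return the smallest positive integer k with 1/k < r, i.e. k > 1/r.
    The Archimedean property guarantees that such a k exists for r > 0."""
    return math.floor(1 / r) + 1


for r in (Fraction(1, 2), Fraction(3, 10), Fraction(1, 100), Fraction(1, 1000)):
    k = index_for_tolerance(r)
    assert Fraction(1, k) < r
    # Spot-check a stretch of terms beyond k: |h_n - 0| < r for n > k.
    for n in range(k + 1, k + 500):
        assert abs(Fraction(1, n) - 0) < r
```

The loop can only inspect finitely many terms; the inequality $1/n < 1/k < r$ in the text is what covers every $n > k$.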


The definition of limit does not require that the members of the sequence
all lie on one side of the limit point, as in Example 1 in Section 5.1 and
Example 1 above. For instance, the sequence $(1, -1, \frac{1}{2}, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{3}, \ldots)$
is closely related to the harmonic sequence and also converges to zero. Here
successive members of the sequence lie on opposite sides of the limit point
zero, as shown in Figure 10.

FIGURE 10 


The members of a convergent sequence need not get successively closer
to the limit point. The sequence

$(s_n) = (1, 1, \frac{1}{2}, \frac{1}{4}, \frac{1}{3}, \frac{1}{9}, \frac{1}{4}, \frac{1}{16}, \ldots)$

$= (1, 1^2, \frac{1}{2}, (\frac{1}{2})^2, \frac{1}{3}, (\frac{1}{3})^2, \frac{1}{4}, (\frac{1}{4})^2, \ldots)$

converges to zero as shown in Figure 11. Here the members of the harmonic
sequence alternate with their squares. We never reach a point in the sequence
beyond which terms become successively smaller. But for any number $r > 0$,
we do reach a point beyond which all terms are less than $r$. And this is the
criterion for convergence to zero.

FIGURE 11


PROBLEMS

1. Prove that the sequences

$(1, -1, \frac{1}{2}, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{3}, \ldots)$

and

$(1, 1, \frac{1}{2}, (\frac{1}{2})^2, \frac{1}{3}, (\frac{1}{3})^2, \ldots)$

both converge to zero.

2. (a) A sequence $S = (s_1, s_2, s_3, \ldots)$ is known to have zero as a limit.
Argue that this sequence cannot also converge to 2.

(b) Give an example of a sequence which is not bounded.

(c) Give an example of a sequence which is bounded but does not
converge.


We now turn to consideration of some basic properties of convergent
sequences.


Property 1 A convergent sequence has a unique limit.

PROOF Suppose the convergent sequence $A = (a_n)$ has two distinct limit
points $L_1$ and $L_2$ with $L_1 < L_2$ (see Figure 12). Then the number
$r = \frac{1}{2}(L_2 - L_1)$ is positive and we have $L_1 + r = L_2 - r$.

FIGURE 12

Now if $L_1$ is a limit of $(a_n)$ there exists an integer $k$ such that
the element $a_n$ is in the interval

$I_1 = \{x : L_1 - r < x < L_1 + r\}$

when $n > k$. Similarly, there exists an integer $m$ such that $a_n$ is in the interval

$I_2 = \{y : L_2 - r < y < L_2 + r\}$

when $n > m$. But this means that when $n$ is larger than both $k$ and $m$, $a_n$
must lie in both $I_1$ and $I_2$, an obvious impossibility. Our supposition that $L_1$
and $L_2$ were different must be false, which is what we wished to prove. ►


Property 2 A convergent sequence is bounded.

PROOF Suppose the sequence $(a_n)$ has limit $L$. Then (taking $r = 1$) there
exists an integer $k$ such that

$L - 1 < a_n < L + 1$ for all $n \ge k$

If $b$ denotes the smallest of the numbers $L - 1, a_1, a_2, \ldots, a_{k-1}$, the
following statements are true:

(i) By definition $b \le a_n$ for $n = 1, 2, 3, \ldots, k - 1$.
(ii) By definition $b \le L - 1$.
(iii) We have $L - 1 < a_n$ for $n = k, k + 1, k + 2, \ldots$

Together these statements show that $b$ is a lower bound for $(a_n)$. (See
Figure 13. In the sketch, $a_3$ is the smallest of these, and since $a_3 < L - 1$,
if $n \ge k$, $a_n$ must be to the right of the point $L - 1$.)

FIGURE 13

A similar argument shows that the convergent sequence $(a_n)$ also has an
upper bound. Hence $(a_n)$ is bounded. ►


PROBLEMS

3. Complete the proof of Property 2 by showing that the largest of the
numbers $L + 1, a_1, a_2, \ldots, a_{k-1}$ (for example, $a_2$ in Figure 13) is an
upper bound for the sequence.

4. Does the sequence $(1, 2, 3, 4, \ldots)$ converge? Why or why not? (Hint:
What does the Archimedean property say?)

Property 3 If $(a_n)$ converges to $L$ then every subsequence of
$(a_n)$ also converges to $L$.

PROOF We know that in order for $(b_k)$ to be a subsequence of $(a_n)$ it
must be obtained by choosing members of $(a_n)$ in such a way that
subscripts on the $a_n$ appear in ascending order. That is,

$b_1 = a_{m_1}, \quad b_2 = a_{m_2}, \quad b_3 = a_{m_3}, \ldots$

where $m_1 < m_2 < m_3 < \cdots$. Obviously, $m_1 \ge 1$, and if $m_k \ge k$, then
$m_{k+1} \ge k + 1$ since $m_{k+1} > m_k$. This simple inductive proof allows us to
conclude that $m_k \ge k$ for $k = 1, 2, 3, \ldots$

The fact that $(a_n)$ converges to $L$ means that for any number $r > 0$, there
is an integer $N$ such that $|a_n - L| < r$ whenever $n > N$. Thus if $k > N$, we
have $|b_k - L| < r$ since $b_k = a_{m_k}$ and $m_k \ge k > N$. This proves that
$L = \lim (b_k)$. ►

PROBLEMS

5. Check the validity of Property 3 by looking at a few subsequences of
the sequence $(1, -1, \frac{1}{2}, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{3}, \ldots)$.


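Example 2 Let us define a sequence $(S_n)$ by adding the terms in the
harmonic sequence. That is, define

$S_1 = 1, \quad S_2 = 1 + \frac{1}{2}, \quad S_3 = 1 + \frac{1}{2} + \frac{1}{3}, \ldots$

and, in general,

$S_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$

Consider the subsequence of $(S_n)$ whose subscripts are powers of 2.
Members of this subsequence are bounded below by multiples of $\frac{1}{2}$. That is,

$S_2 = 1 + \frac{1}{2} > 0 + \frac{1}{2} = \frac{1}{2}$

$S_4 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 0 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = 2(\frac{1}{2})$

$S_8 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 0 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = 3(\frac{1}{2})$

In general, we find that $S_{2^n} > \frac{1}{2}n$. This means that the subsequence
$(S_2, S_4, S_8, S_{16}, \ldots)$ is not bounded and, using Property 2, must diverge.
But this implies (Property 3) that the original sequence $(S_1, S_2, S_3, \ldots)$
also diverges. ►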
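
The lower bounds used in Example 2 are easy to tabulate. The following Python sketch is illustrative only: the function name `harmonic_partial_sum` is an assumption of ours, and exact rational arithmetic is used so the comparisons are not affected by rounding. It computes $S_{2^m}$ for a few values of $m$ and confirms that each exceeds $m/2$.

```python
from fractions import Fraction


def harmonic_partial_sum(n):
    """Return S_n = 1 + 1/2 + 1/3 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


# Check the bound S_{2^m} > m/2 for the first few powers of 2.
for m in range(1, 9):
    partial = harmonic_partial_sum(2**m)
    assert partial > Fraction(m, 2)
    print(2**m, float(partial), m / 2)
```

The printed values grow very slowly, which is why the divergence of $(S_n)$ is not apparent from computing a handful of terms and has to be argued through the subsequence.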

Property 4 If $A = (a_n)$ and $B = (b_n)$ are convergent sequences with
$a = \lim A$ and $b = \lim B$ then

(i) $\lim (A + B) = a + b$
(ii) $\lim (A - B) = a - b$
(iii) $\lim (A \cdot B) = a \cdot b$

If, in addition, $B$ has no zero elements and $b \ne 0$, then

(iv) $\lim (A/B) = a/b$ ►




Result (i) is often expressed by saying that the limit of the sum of two
convergent sequences is the sum of their respective limits, or simply, the limit
of a sum is the sum of the limits. Similar comments apply to the other cases.
We shall prove (i) and (iii), and we leave (ii) and (iv) as exercises (Problem 12).


PROOF OF PROPERTY 4 To prove (i) we must show that for any
$r > 0$, there is an integer $k$ such that

$|(a_n + b_n) - (a + b)| < r$

when $n > k$. Since $a = \lim A$, there is an integer $k_1$ such that

$|a_n - a| < \frac{r}{2}$ when $n > k_1$

Similarly, since $b = \lim B$, there exists an integer $k_2$ such that

$|b_n - b| < \frac{r}{2}$ when $n > k_2$

Now let $k$ be the larger of $k_1$ and $k_2$. Then for $n > k$ we have

$|(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)|$

$\le |a_n - a| + |b_n - b|$    [Triangle Inequality]

$< \frac{r}{2} + \frac{r}{2} = r$

To prove (iii), we note that $|a_n b_n - ab|$ can be written

$|a_n b_n - ab| = |a_n b_n - a b_n + a b_n - ab|$

$= |b_n(a_n - a) + a(b_n - b)|$

$\le |b_n| |a_n - a| + |a| |b_n - b|$

Since $(b_n)$ is convergent, it is bounded; there is a number $u > 0$ such that
$|b_n| < u$ for all $n$. As in the proof of (i), for $r > 0$ we can find an
integer $k$ such that when $n > k$, we have simultaneously

$|a_n - a| < \frac{r}{2u}$ and $|b_n - b| < \frac{r}{2|a|}$

(If $a = 0$, the latter quantity is not needed.) Hence, for $n > k$,

$|a_n b_n - ab| \le |b_n| |a_n - a| + |a| |b_n - b| < u \cdot \frac{r}{2u} + |a| \cdot \frac{r}{2|a|} = r$ ►

Example 3 (a) If, for some constant $c$, the sequences $(a_n)$ and $(b_n)$ are
related by

$b_n = a_n + c$

and if $(a_n)$ converges, then Property 4(i) gives

$\lim (b_n) = \lim (a_n + c)$

$= \lim (a_n) + \lim (c)$

$= \lim (a_n) + c$

(b) If for some constant $c$, $(a_n)$ and $(b_n)$ are related by

$b_n = c \cdot a_n$

and if $(a_n)$ converges, then Property 4(iii) implies

$\lim (b_n) = \lim (c \cdot a_n) = [\lim (c)][\lim (a_n)] = c \lim (a_n)$ ►

Example 4 When a test having reliability $r$ ($0 < r < 1$) is made $n$ times
longer, the reliability $R_n$ of the new test is, according to the classical
Spearman-Brown formula,* given by

$R_n = \frac{nr}{1 + (n - 1)r}$

For a fixed value of $r$, the sequence $(R_n)$ has the limit

$\lim_{n \to \infty} (R_n) = \lim_{n \to \infty} \frac{nr}{1 + (n - 1)r}$

$= \lim_{n \to \infty} \frac{r}{(1/n) + [1 - (1/n)]r}$    [Dividing top and bottom by $n$]

$= \frac{\lim (r)}{\lim (1/n) + [\lim (1) - \lim (1/n)]r}$    [Properties 4(i)-(iv)]

$= \frac{r}{0 + (1 - 0)r}$    [Example 1(b)]

$= 1$

Thus by simply increasing the length of a test, we can make its reliability as
close to unity (perfect reliability) as we please.
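
The approach of $R_n$ to 1 can also be seen numerically. The Python sketch below is an illustration, not part of the text: the function name `spearman_brown` and the starting reliability $r = 0.5$ are our own assumptions. It evaluates the formula of Example 4 for increasing test lengths.

```python
def spearman_brown(r, n):
    """Reliability R_n of a test made n times longer, given the original
    reliability r with 0 < r < 1: R_n = n*r / (1 + (n - 1)*r)."""
    return n * r / (1 + (n - 1) * r)


r = 0.5  # hypothetical reliability of the original test
for n in (1, 2, 3, 10, 100, 1000):
    print(n, spearman_brown(r, n))
```

With $n = 1$ the formula returns the original reliability, and the values increase toward 1 as $n$ grows, in line with the limit computed above.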

PROBLEMS

6. Suppose the sequences $(a_n)$, $(b_n)$, and $(c_n)$ are related by $a_n \le b_n \le c_n$
for $n = 1, 2, 3, \ldots$ Show that if $(a_n)$ and $(c_n)$ have a common limit $L$,
then the limit of $(b_n)$ must also be $L$.

7. Prove that $\lim (1/2^n) = 0$. (Hint: From Problem 13 of Section 5.2,
we know $0 < 1/2^n < 1/n$. Apply Problem 6.)

8. Find $\lim (4 - 1/2^n)$.

9. Prove that if $m$ is a positive integer, then $\lim (1/n^m) = 0$. (Hint: First
prove by induction that $1/n \ge 1/n^m > 0$ and then apply Problem 6.)

* Gulliksen, H., Theory of Mental Tests (John Wiley & Sons, Inc., New York, 1950),
Chapter 8.



10 Use Property 4 and Problem 9 to determine limits (if they exist) of the 
sequences whose nth terms are 


(b) 2n + (-1)" 



4n 2 — 5/2—2 

(d)    (f)    (h)

$n^2 + 2$    $2n^2 - 2$

$(-1)^n$

$6n^3$

$n(n + 2)$

$(n + 1)(n + 4)$

11. Prove that if $\lim (a_n)$ exists and $u$ is an upper bound for $(a_n)$, then
$\lim (a_n) \le u$. Give an example in which equality holds and another in
which inequality holds.

12. Prove Properties 4(ii) and 4(iv).

13. Prove that $\lim (2/\sqrt{n^2 + 1}) = 0$. (Hint: $n^2 < n^2 + 1$ and therefore
$1/n > 1/\sqrt{n^2 + 1} > 0$.)

14. Prove that if $A = (a_n)$ has limit $a$ then

(a) $\lim A^2 = a^2$

(b) $\lim (-a_n) = -a$

15. Show by example that if $\lim (A + B) = L$, it does not necessarily follow
that either $A$ or $B$ converges.


5.4 CONVERGENCE OF MONOTONE SEQUENCES

A convergent sequence must be bounded, but a bounded sequence need not be
convergent. For example, the sequence $(1, -1, 1, -1, \ldots)$, whose members
are alternately 1 and $-1$, is certainly bounded but is not convergent.
However, there are certain types of sequences, in particular those which are
either increasing or decreasing, for which boundedness is sufficient to
ensure convergence.

Property 5 A bounded increasing sequence converges to its least upper
bound. A bounded decreasing sequence converges to its greatest lower bound.

PROOF The proofs of the two statements are quite similar. We shall prove
only the first, leaving the second as an exercise.

If the increasing sequence $(a_n)$ is bounded, it has a least upper bound $u$.
Then $u$ is an upper bound, but for any $r > 0$, the number $u - r$ is not an
upper bound. This means that there exists a member $a_k$ of the sequence for
which

$u - r < a_k \le u$



Since the sequence is increasing, we must have

$u - r < a_k \le a_n \le u$ for all $n \ge k$

(This is illustrated in Figure 14. Note that when $n \ge k$, the distance between
$u$ and $a_n$ cannot exceed $r$.)

FIGURE 14 (number line marking $a_1$, $a_2$, $u - r$ (not an upper bound), $a_k$, $a_n$, and $u$ (least upper bound))

To sum up, we have shown that for any $r > 0$, there exists a number $k$ such
that $|a_n - u| < r$ for all $n \ge k$. This implies that $u = \lim (a_n)$. ►


Example 1 Reread Example 1 of Section 5.1 and the pursuant discussion
at the beginning of Section 5.3.


Our original definition for convergence has the distinct disadvantage that
in order to be used, we must already know, or at least suspect, the
value of the limit. We then merely verify that our suspicion is correct. Thus
in Examples 1, Section 5.1 and 1(b), Section 5.3 it was readily apparent that
the limits were, respectively, 2 and 0, and we were able to show that these
values satisfied the requirements of the definition.

On the other hand, Property 5 allows us to assert that a limit exists even
when no particular candidate presents itself. This, together with Property 3,
often allows an easy solution to the problem of finding limits.


Example 2 Let $(a_n)$ be the sequence defined recursively by

$a_1 = 1$

$a_{n+1} = \frac{2a_n + 3}{4}$ for $n \ge 1$

A simple calculation shows that $1 = a_1 < a_2 = \frac{5}{4} < 2$. Proceeding
inductively, if $a_{n-1} < a_n < 2$, then

$\frac{2a_{n-1} + 3}{4} < \frac{2a_n + 3}{4} < \frac{4 + 3}{4} < 2$

which implies that

$a_n < a_{n+1} < 2$

Thus the sequence $(a_n)$ is increasing and bounded. It follows from
Property 5 that this sequence converges to a limit $L$. Since $(a_{n+1})$ is a
subsequence of $(a_n)$ and hence, by Property 3, ...
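
The behavior described in Example 2 can be observed by iterating the recursion. The Python sketch below is only an illustration: the function name `recursive_terms` is our own, and the value $3/2$ used in the final comparison is our own computation (it is the solution of $L = (2L + 3)/4$), not a figure quoted from the text.

```python
from fractions import Fraction


def recursive_terms(count):
    """Return the first `count` terms of a_1 = 1, a_{n+1} = (2*a_n + 3)/4,
    computed exactly with fractions."""
    a = Fraction(1)
    terms = [a]
    for _ in range(count - 1):
        a = (2 * a + 3) / 4
        terms.append(a)
    return terms


terms = recursive_terms(30)
# The terms increase and stay below 2, as the inductive argument shows.
assert all(x < y for x, y in zip(terms, terms[1:]))
assert all(t < 2 for t in terms)
# Numerically they settle near 3/2 (our own fixed-point computation).
assert abs(terms[-1] - Fraction(3, 2)) < Fraction(1, 10**8)
```

Property 5 supplies the existence of the limit; the iteration merely suggests which number to expect.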




PROBLEMS


3. Stephan and Mishler* have found that the rate of total participation of
an individual in a small discussion group is a function of his rank order
according to total number of "acts of interaction" during the discussion
period and follows the form

$p_i = ar^{i-1}$ for $i = 1, 2, \ldots, n$

where $n$ is the size of the group excluding the leader, $r$ and $a$ are
constants, and $p_i$ is the estimated proportion of total participative acts
initiated by the $i$th ranking group member.
The leader's interaction rate is

$l_n = 1 - [p_1 + p_2 + \cdots + p_n]$

since the proportions must sum to 1, that is,

$l_n + p_1 + p_2 + \cdots + p_n = 1$

(a) Find the limiting value, if it exists, of the sequence $(l_n)$ representing
the leader's interaction rate. (Hint: Use Problem 11, Section 5.2.)

(b) Does the limit exist for all real values of $a$ and $r$? If not, indicate the
set of values of $a$ and $r$ for which the limit does exist.

4. In Example 3 of Section 5.2, prove that the sequence $(S_t)$

(a) is convergent when $-1 < d/b < 0$

(b) is bounded but divergent when $d/b = -1$

(c) is unbounded and divergent when $d/b < -1$


5. In Problem 4, Section 5.2, prove that $(P_t)$ is convergent whenever $\beta > \alpha$
and divergent whenever $\alpha > \beta$. Interpret these results in terms of the
model for population growth.


6. Suppose that $(a_n)$ is the sequence defined recursively by

$a_1 = 2$ and $a_{n+1} = ca_n + k$ for $n \ge 1$

where $0 < c < 1$ and $k$ is a real constant.

(a) Find $a_n$ in terms of $a_1$, $c$, and $k$.

(b) Find $\lim_{n \to \infty} (a_n)$.

(c) Apply these results to Equation (5) of Example 4.

7. Prove that if two subsequences of a given sequence converge to different
limits, then the sequence does not converge.

8. In Problem 3, Section 5.2, you have shown that the accumulated amount
resulting when a principal $P$ is invested for $n$ years at interest rate $i$
compounded annually is

$y = P(1 + i)^n$

* Stephan, F. F., and Mishler, E. G., "The Distribution of Participation in Small Groups:
An Exponential Approximation," American Sociological Review 17, 598-603 (1952).



(a) Show that if interest is compounded $m$ times a year, the general
formula is

$y = P\left(1 + \frac{i}{m}\right)^{mn} = P\left[\left(1 + \frac{1}{r}\right)^r\right]^{in}$

where $r = m/i$.

(b) Show that if for each positive integer $r$ we let $s_r = [1 + (1/r)]^r$ and
$t_r = [1 + (1/r)]^{r+1}$, then

(i) $(s_r)$ is an increasing sequence. (Hint: Show that $s_{r+1}/s_r > 1$
by writing $s_{r+1}/s_r$ in the form $(1 + 1/r)\{1 - [1/(r + 1)^2]\}^{r+1}$
and then applying Problem 12 of Section 5.2.)

(ii) $(t_r)$ is a decreasing sequence.

(iii) $s_r < t_k$ for every $k$ and $r$.

(c) Given (i), (ii), and (iii), prove that the sequence $[1 + (1/r)]^r$ has $t_1$
as an upper bound and therefore converges.

The mathematical constant $e$ is defined as the limit as $r \to \infty$ of the
sequence of numbers $[1 + (1/r)]^r$. Hence, if the number of compoundings
within a year is extended indefinitely,

$y = Pe^{in}$

which is a formula frequently used in investment theory.
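
The sequences in part (b) and the limiting formula can be explored numerically before attempting the proofs. The Python sketch below is illustrative only: the function names and the sample figures ($P = 1000$, $i = 0.05$, $n = 10$) are our own assumptions. It checks, for a range of $r$, that $s_r$ increases, $t_r$ decreases, and $e$ lies between them, and then compares frequent compounding with $Pe^{in}$.

```python
import math


def s(r):
    """s_r = (1 + 1/r)^r"""
    return (1 + 1 / r) ** r


def t(r):
    """t_r = (1 + 1/r)^(r + 1)"""
    return (1 + 1 / r) ** (r + 1)


def compound_amount(P, i, n, m):
    """Amount after n years at annual rate i compounded m times a year."""
    return P * (1 + i / m) ** (m * n)


for r in range(1, 50):
    assert s(r) < s(r + 1)        # (s_r) increasing
    assert t(r) > t(r + 1)        # (t_r) decreasing
    assert s(r) < math.e < t(r)   # e is trapped between the two sequences

# Hypothetical investment: P = 1000, i = 5% per year, n = 10 years.
frequent = compound_amount(1000, 0.05, 10, 10**6)
continuous = 1000 * math.exp(0.05 * 10)
assert math.isclose(frequent, continuous, rel_tol=1e-5)
```

Floating-point checks over a finite range are a sanity test only; the monotonicity claims for all $r$ still require the argument based on Problem 12 of Section 5.2.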

9. Prove that every bounded decreasing sequence converges to its greatest
lower bound.

10. Prove that if $(a_n)$ is an increasing sequence bounded above by 1, then
$\lim (a_n)$ exists and is less than or equal to 1. Give an example in which
the limit is less than 1.

11. If $a_n = (-3)^n$, find $\lim (a_n)$.

12. If $a_n = (-1)^{n+1}(n - 1)/2^n$, find $\lim (a_n)$.

13. In Problem 5 of Section 5.2, find $\lim_{n \to \infty} (p_n)$. What conditions must be
imposed on the constants $a$ and $b$ in order to insure convergence?

14. As a description of the extinction process, Bush and Mosteller* propose
the relation

$p_n = p_{n-1} - bp_{n-1} = (1 - b)p_{n-1}$

between the probabilities $p_{n-1}$ and $p_n$ of response on the $(n - 1)$st and
$n$th trials, respectively. The positive constant $b$ is called an extinction
parameter. Prove that the sequence $(p_n)$

(i) converges to zero if $0 < b < 2$

(ii) is the constant sequence $(p_0, p_0, p_0, \ldots)$ if $b = 0$

(iii) diverges if $b > 2$

* Bush, R. R., and Mosteller, F., "A Mathematical Model for Simple Learning,"
Psychological Review 58, 313-323 (1951).


SUPPLEMENTARY READING 

Anderson, K. W., and Hall, D. W., Sets, Sequences and Mappings (John Wiley &
Sons, Inc., New York, 1963), Chapters 2 and 4.



6 INFINITE SERIES


6.1 SUMMATION NOTATION 

The capital Greek sigma ($\Sigma$) notation is a shorthand method in mathematics
for designating sums. As an example, the sum of squares of the first seven
positive integers may be written

$1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 = \sum_{k=1}^{7} k^2$

The symbol $k^2$ indicates the form of the quantities which are to be added.
The symbols "$k = 1$" below and "7" above the summation sign $\Sigma$ indicate
that 1 is the initial value taken by $k$ and that 7 is the terminal value. It is
always understood that every integer from the initial value to the terminal
value, inclusive, is to be included in the sum. Conventionally, the initial value
is the lesser of the two values. The symbol $k$ is called the variable of
summation.

Quantities to be added may take a variety of forms. If, for instance, the function f is defined for each integer k by \(f(k) = 2k + 1\), then

\[ \sum_{k=4}^{7} f(k) = f(4) + f(5) + f(6) + f(7) \]
\[ = (2 \cdot 4 + 1) + (2 \cdot 5 + 1) + (2 \cdot 6 + 1) + (2 \cdot 7 + 1) \]
\[ = 9 + 11 + 13 + 15 \]
\[ = 48 \]

Similarly,

\[ \sum_{k=0}^{5} k = 0 + 1 + 2 + 3 + 4 + 5 = 15 \]

Each of the above examples is an illustration of the basic definition

\[ \sum_{k=m}^{n} f(k) = f(m) + f(m + 1) + \cdots + f(n) \tag{1} \]
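Definition (1) translates directly into a short program. The sketch below is illustrative only; the helper name `sigma` is an assumption, not notation from the text.

```python
from typing import Callable

def sigma(f: Callable[[int], float], m: int, n: int) -> float:
    """Return f(m) + f(m+1) + ... + f(n), following definition (1)."""
    return sum(f(k) for k in range(m, n + 1))   # range stops before n + 1, so n is included

print(sigma(lambda k: k ** 2, 1, 7))      # sum of squares of 1 through 7
print(sigma(lambda k: 2 * k + 1, 4, 7))   # the f(k) = 2k + 1 example
print(sigma(lambda k: k, 0, 5))           # 0 + 1 + ... + 5
```

The only subtlety is the inclusive terminal value: Python's `range` excludes its upper end, so the call uses `n + 1`.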


PROBLEMS 

1 Evaluate the following sums 


(a) E (r - 3) 


(b) E 2' 


(c) E » 2 - 


The variable of summation often appears as a subscript. For instance, if we have four quantities \(x_1 = 3\), \(x_2 = 6\), \(x_3 = -4\), and \(x_4 = 2\), then

\[ \sum_{k=1}^{4} x_k = x_1 + x_2 + x_3 + x_4 = 3 + 6 - 4 + 2 = 7 \]

while

\[ \sum_{n=1}^{4} (2x_n - 1) = (2x_1 - 1) + (2x_2 - 1) + (2x_3 - 1) + (2x_4 - 1) \]
\[ = (6 - 1) + (12 - 1) + (-8 - 1) + (4 - 1) = 10 \]

PROBLEMS 

2 Given that xi - 2,* 2 . . 6 , and „ _ _ 4 , find 

W ? I 4 *' (b) E x t 

5 ’ =1 

(c) 12 (3 x k + n ,, N , 

52(*,+ l)(x, + 2) 

3. Rewrite the following expressions in \(\Sigma\) notation:

(a) \(\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32}\)

(b) xl -f x l + x 2 __ 4 

4. Rewrite the following in ordinary notation:


(a) E 2 " 


fc) E *0+ 1) 


(b) E *.i 

3-1 

«) E 





It is important to observe that the symbol used for the variable of summation is entirely arbitrary and for this reason is called a dummy variable. For instance,

\[ x_1 + x_2 + \cdots + x_n = \sum_{k=1}^{n} x_k = \sum_{j=1}^{n} x_j = \sum_{i=1}^{n} x_i = \sum_{p=1}^{n} x_p \]

and

\[ \log 2 + \log 3 + \log 4 = \sum_{k=2}^{4} \log k = \sum_{j=2}^{4} \log j = \sum_{r=2}^{4} \log r \]


In addition, the quantities to be added may appear in different forms and yet their sums may be identical. As examples,

\[ 1^2 + 2^2 + 3^2 + 4^2 = \sum_{k=1}^{4} k^2 = \sum_{k=2}^{5} (k - 1)^2 = \sum_{k=10}^{13} (k - 9)^2 = \sum_{k=-2}^{1} (k + 3)^2 \]

and

\[ x_1 + x_2 + \cdots + x_n = \sum_{k=1}^{n} x_k = \sum_{k=0}^{n-1} x_{k+1} = \sum_{k=2}^{n+1} x_{k-1} \tag{2} \]

In each case the initial and terminal values of the summation variable have been altered, but the quantities being summed have had their form altered in a compensating manner so that the value of the sum is unchanged.


PROBLEMS 

5. Compute

\[ \sum_{k=1}^{4} k^2 + \sum_{j=-1}^{2} (j - 1) \]

and compare with

\[ \sum_{r=4}^{7} (r - 3)^2 + \sum_{t=2}^{5} [(t - 3) - 1] \]

(You should obtain the same result in both cases.)

A convenient way of obtaining the various summation forms in (2) is by making a change of variable. For example, to obtain \(\sum_{k=0}^{n-1} x_{k+1}\) from \(\sum_{k=1}^{n} x_k\), we first replace k by r + 1 to obtain

\[ \sum_{k=1}^{n} x_k = \sum_{r+1=1}^{n} x_{r+1} \tag{3} \]

Now, if r + 1 runs from 1 to n, then r itself must take values from 0 to n - 1. (That is to say, if r + 1 = 1, then r = 0, while if r + 1 = n, then r = n - 1.) Thus, we have

\[ \sum_{r+1=1}^{n} x_{r+1} = \sum_{r=0}^{n-1} x_{r+1} \tag{4} \]

Replacing r by k in this last sum yields

\[ \sum_{r=0}^{n-1} x_{r+1} = \sum_{k=0}^{n-1} x_{k+1} \tag{5} \]

Combining Equations (3)-(5) produces the desired result.
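The effect of a change of variable can be checked on concrete numbers. The following sketch is an illustration only; the function name `shifted_sums` is an assumption, and the list stores \(x_1\) at position 0, so \(x_k\) is read as `x[k - 1]`.

```python
def shifted_sums(x: list[float]) -> tuple[float, float, float]:
    """Evaluate three index-shifted forms of the same sum x_1 + ... + x_n."""
    n = len(x)
    original = sum(x[k - 1] for k in range(1, n + 1))            # k = 1 .. n,   term x_k
    shifted_down = sum(x[(k + 1) - 1] for k in range(0, n))      # k = 0 .. n-1, term x_(k+1)
    shifted_up = sum(x[(k - 1) - 1] for k in range(2, n + 2))    # k = 2 .. n+1, term x_(k-1)
    return original, shifted_down, shifted_up

print(shifted_sums([3, 6, -4, 2]))
```

Each form moves the limits one way and the subscript the other, so all three values agree.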

PROBLEMS 

6. Compute

\[ \sum_{k=4}^{7} (k^2 - 3k + 2) \]

Note that this is identical with

\[ \sum_{k=4}^{7} (k - 1)(k - 2) \]

Now make the change of variable r = k - 2 to obtain

\[ \sum_{r=2}^{5} r(r + 1) = \sum_{r=2}^{5} (r^2 + r) \]

Compute this latter sum to see that its value agrees with the first computation.

The basic rules of operation for sums parallel the associative and distributive properties of numbers (Section 1.4) and are easily obtained from the definition (1). Thus, if in the sum

\[ \sum_{k=1}^{n} x_k = x_1 + x_2 + x_3 + \cdots + x_n \]

each of the \(x_k\) is equal to the same constant c, then

\[ \sum_{k=1}^{n} x_k = c + c + \cdots + c = nc \]

Thus we obtain our first rule for sums.

Rule 1 for Sums If each \(x_k = c\), then

\[ \sum_{k=m}^{n} x_k = \sum_{k=m}^{n} c = (n - m + 1)c \] ►

Example 1 \(\sum_{k=1}^{6} 4 = 6(4) = 24\) and \(\sum_{k=2}^{11} \tfrac{3}{2} = 10(\tfrac{3}{2}) = 15\) ►

Combining the two sums

\[ \sum_{k=1}^{r} x_k = x_1 + x_2 + \cdots + x_r \]

and

\[ \sum_{k=r+1}^{n} x_k = x_{r+1} + x_{r+2} + \cdots + x_n \]

we obtain our second rule.

Rule 2 for Sums \(\displaystyle \sum_{k=1}^{r} x_k + \sum_{k=r+1}^{n} x_k = \sum_{k=1}^{n} x_k\) ►

Next, suppose that for each value of k we have \(x_k = cy_k\) for some constant c. Then

\[ \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} (cy_k) = cy_1 + cy_2 + \cdots + cy_n \]
\[ = c(y_1 + y_2 + \cdots + y_n) \]
\[ = c \sum_{k=1}^{n} y_k \]

Similar considerations yield our third rule.

Rule 3 for Sums \(\displaystyle \sum_{k=m}^{n} (cx_k) = c \sum_{k=m}^{n} x_k\) ►

Rule 3 says that constant factors may be written either inside or outside the summation sign. This, of course, is merely another way of writing the distributive law for numbers.

Our fourth rule follows from the commutative and associative laws for real numbers.

Rule 4 for Sums \(\displaystyle \sum_{k=m}^{n} (x_k + y_k) = \sum_{k=m}^{n} x_k + \sum_{k=m}^{n} y_k\) ►

This is easily proved by noting that

\[ \sum_{k=m}^{n} (x_k + y_k) = (x_m + y_m) + (x_{m+1} + y_{m+1}) + \cdots + (x_n + y_n) \]
\[ = (x_m + x_{m+1} + \cdots + x_n) + (y_m + y_{m+1} + \cdots + y_n) \]
\[ = \sum_{k=m}^{n} x_k + \sum_{k=m}^{n} y_k \]
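The four rules can be spot-checked on integer data, where the arithmetic is exact. This is a minimal sketch; the helper `total`, the sample sequences, and the values of m, n, r, and c are all illustrative assumptions. Rule 2 is checked with a general lower limit m rather than 1.

```python
def total(values: dict[int, int], lo: int, hi: int) -> int:
    """Sum values[k] for k = lo, lo + 1, ..., hi (inclusive)."""
    return sum(values[k] for k in range(lo, hi + 1))

m, n, r, c = 2, 9, 5, 3                       # assumed limits, split point, and constant
x = {k: k * k - 4 for k in range(m, n + 1)}   # arbitrary sample x_k
y = {k: 7 - 2 * k for k in range(m, n + 1)}   # arbitrary sample y_k

rule1 = total({k: c for k in range(m, n + 1)}, m, n) == (n - m + 1) * c
rule2 = total(x, m, r) + total(x, r + 1, n) == total(x, m, n)
rule3 = total({k: c * x[k] for k in x}, m, n) == c * total(x, m, n)
rule4 = total({k: x[k] + y[k] for k in x}, m, n) == total(x, m, n) + total(y, m, n)
print(rule1, rule2, rule3, rule4)
```

A numerical check of this kind is not a proof, but it catches slips in limits, such as forgetting the "+ 1" in the count n - m + 1.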


PROBLEMS 

7. Use the above rules for sums to simplify the form of

\[ \sum_{k=1}^{7} (3k^2 + 6k + 1) + \sum_{r=2}^{8} (-3r^2 + 2r) + \sum_{k=8}^{12} 2k \]

(Hint: First write r = k + 1.)



Example 2 An investment yielding fixed periodic payments is called an annuity. The present value of an annuity is the amount which must be invested initially in order to provide the payments.

Let us suppose that you wish to receive $T annually for the next n years from an annuity on which the remaining principal is compounded annually at 100i%. Since \(P_j\) dollars invested now will amount to \(P_j(1 + i)^j\) dollars in j years (see Problem 3, Section 5.2), the amount to be invested now in order to obtain $T in j years is \(P_j = T/(1 + i)^j\). Hence the present value of this annuity is

\[ P = \sum_{j=1}^{n} P_j = \sum_{j=1}^{n} \frac{T}{(1 + i)^j} \] ►

PROBLEMS 

8. Use Problem 11 of Section 5.2 to show that in Example 2, the quantity

\[ \sum_{j=1}^{n} \frac{T}{(1 + i)^j} \]

may be rewritten in the simpler form

\[ \frac{T}{i}\left[1 - \left(\frac{1}{1 + i}\right)^n\right] \]

9. What amount must be invested at 5% interest in order to guarantee 20 yearly payments of $10,000?

10. Show that

(a) \(\displaystyle \sum_{i=1}^{k} x_i + \sum_{i=1}^{n-k} x_{i+k} = \sum_{i=1}^{n} x_i\)

(b) \(\displaystyle \sum_{j=1}^{n} (kx_j + my_j) = k \sum_{j=1}^{n} x_j + m \sum_{j=1}^{n} y_j\)

11. Suppose that a business enterprise is so organized that the president has n immediate subordinates, each of whom also has n subordinates, and so on. If the president is regarded as the first level, show that the total number of personnel in an organization with L levels of authority is

\[ T = \frac{n^L - 1}{n - 1} \]

12. Suppose that we wished to approximate an organizational structure with six administration levels by the simple hierarchy in Problem 11.

(a) How many employees would there be required if the span of control (number of subordinates per superior) were (i) three? (ii) four?

(b) If the number of levels is doubled, how many more people are required?

13. If \(\bar{x} = (1/n)\sum_{k=1}^{n} x_k\) is the mean (average) of the n numbers \(x_1, x_2, \ldots, x_n\), prove that \(\sum_{i=1}^{n} (x_i - \bar{x}) = 0\). The differences \((x_i - \bar{x})\) are termed the deviations of the \(x_i\) from their average \(\bar{x}\). In effect, you are being asked to prove that the sum of deviations from the mean (average) of a set of quantities is zero, a fact often used in statistical computations.

14. Show that

(a) \(\displaystyle \sum_{j=1}^{n} (x_j + y_j)^2 = \sum_{j=1}^{n} x_j^2 + 2\sum_{j=1}^{n} x_j y_j + \sum_{j=1}^{n} y_j^2\)

(b) \(\displaystyle \sum_{j=1}^{n} (x_j - \bar{x})^2 = \sum_{j=1}^{n} x_j^2 - n\bar{x}^2\)

where \(\bar{x}\) is as defined in Problem 13.

15. Mr. Franklin has $10,000 with which to purchase a 4% annuity compounded annually. What is the amount he will receive annually if he elects to receive income payments for

(a) 2 years? (b) 5 years? (c) n years?
(Hint: Refer to Problem 8.)

16. A company with a simple hierarchal organization structure as described in Problem 11 has called in a consultant to do a time study of the communication flow from the president's office to the lower levels. Suppose that company policy requires all employees at a given organization level be contacted before the communication is transmitted to the next lower level. As a result of his analysis, the consultant reported the following element times:

\(t_p\) = the average time (in minutes) required for a subordinate to process the message

\(t_w\) = the average time required to transmit the message between subordinates at the same organizational level

\(t_b\) = the average time required to transmit the message from one organizational level to a designated contact on the next lower one

(a) If there are L levels in the organization and a span of control of n, show that the total average time required for a message originating in the president's office to be processed sequentially through the entire organization is

\[ T = \frac{n^L - 1}{n - 1}(t_p + t_w) + L(t_b - t_w) - (t_b + t_p) \]

(b) Compute the average time for sequential circulation of a message if n = 4, L = 6, \(t_p = 10\), \(t_w = 30\), \(t_b = 10\). What suggestions would you make?


6.2 CONVERGENCE OF INFINITE SERIES

How can a value be attached to an expression of the form

\[ a_1 + a_2 + a_3 + a_4 + \cdots + a_n + \cdots \tag{6} \]





FIGURE 1 (number line showing the successive steps \(a_1 = 1\), \(a_2 = \tfrac{1}{2}\), \(a_3 = \tfrac{1}{4}\), \(a_4 = \tfrac{1}{8}\))


We recognize S as the sequence of partial sums of the infinite series generated by the sequence A. The kth partial sum is

\[ s_k = 2 - \frac{1}{2^{k-1}} \]

Since \(\lim (s_k) = 2\), we have

\[ \sum_{n=0}^{\infty} \left(\tfrac{1}{2}\right)^n = \lim (s_k) = 2 \]

as the sum of our infinite series of step lengths.

The infinite series of Example 1 is a special kind of geometric series whose general form is

\[ \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots \]

Simple multiplication shows that

\[ (1 - x)(1 + x + x^2 + \cdots + x^{k-1}) = 1 - x^k \]

so that the kth partial sum of a geometric series is

\[ s_k = 1 + x + x^2 + \cdots + x^{k-1} = \frac{1 - x^k}{1 - x} \qquad (x \ne 1) \]

If x = 1, then \(s_k = k\). For instance, in Example 1, \(x = \tfrac{1}{2}\) and \(s_k = [1 - (\tfrac{1}{2})^k]/[1 - \tfrac{1}{2}] = 2 - (\tfrac{1}{2})^{k-1}\). Since \(\lim (x^k) = 0\) when \(|x| < 1\), the geometric series converges when \(|x| < 1\) and its sum is

\[ \sum_{n=0}^{\infty} x^n = \lim \left(\frac{1 - x^k}{1 - x}\right) = \frac{1}{1 - x} \tag{9} \]

For all other values of x, this series has no sum (that is, it diverges).
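The agreement between the term-by-term partial sum and the closed form, and the approach of both to 1/(1 - x), can be seen numerically. This is a minimal sketch; both function names are illustrative assumptions.

```python
def geometric_partial_sum(x: float, k: int) -> float:
    """s_k = 1 + x + ... + x**(k-1), added term by term."""
    return sum(x ** n for n in range(k))

def geometric_closed_form(x: float, k: int) -> float:
    """The same partial sum from (1 - x**k) / (1 - x), with s_k = k when x = 1."""
    return float(k) if x == 1 else (1 - x ** k) / (1 - x)

for k in (1, 2, 5, 10, 50):
    print(k, geometric_partial_sum(0.5, k), geometric_closed_form(0.5, k))
```

With x = 1/2 the printed values approach 2, the sum found in Example 1. Trying a value such as x = 1.5 instead shows the partial sums growing without bound.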

Example 2 A common assumption made in economics is that if $1 is spent in goods and services, total income will increase by $1. If a proportion k (0 < k < 1) of the increase in income is in turn spent for goods and ser-



or to the form

\[ \sum_{n=1}^{\infty} a_n \tag{7} \]

which we take to be equivalent with (6)? The answer to this question of how to add a sequence of numbers \((a_1, a_2, a_3, \ldots, a_n, \ldots)\) constitutes the study of infinite series.

It is important to note that there is no particular value which we are able a priori to assign to an array of symbols which calls for an infinite number of additions. The associative and distributive properties for addition of real numbers (Section 1.4) provide rules by which two or three numbers may be added. These rules may be extended by induction to any finite set of numbers, but nowhere in our previous discussion is there any rule which governs the present case. We must consider new rules.

We begin with a sequence \((a_1, a_2, a_3, \ldots)\) and our problem is to "add" all these numbers. The procedure for doing this is quite simple. We consider that the first number \(a_1\) forms a first sum, \(s_1 = a_1\). We then add \(a_2\) to \(s_1\) to form a second sum, \(s_2 = a_1 + a_2\). Similarly, we obtain a third sum, \(s_3 = a_1 + a_2 + a_3\), a fourth sum, \(s_4 = a_1 + a_2 + a_3 + a_4\), and, in general, for any positive integer k, a kth sum

\[ s_k = a_1 + a_2 + \cdots + a_k = \sum_{n=1}^{k} a_n \tag{8} \]

The sequence \(S = (s_1, s_2, s_3, \ldots, s_k, \ldots)\) with \(s_k\) defined by (8) is called the infinite series generated by the sequence \(A = (a_1, a_2, a_3, \ldots)\). If the sequence S is convergent, we call lim S the sum of the infinite series. If S does not converge, the series is said to diverge or to have no sum. The numbers \(a_n\) are called the terms of this infinite series and the numbers \(s_k\) are called its partial sums.

It is customary to use (6) and (7) both for the infinite series itself and for the sum of this series. As long as it is understood that convergence of a series must be established, the double use of the notation causes no difficulty.

Having obtained a workable definition of the sum of a series, let us see if it gives results we would expect.

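Example 1 In Section 5.1, we considered the problem of starting at the point 0 and taking to the right a sequence of steps each of which covered half the distance remaining to the point 2. The successive step lengths (Figure 1) form the sequence

\[ A = \left(1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots\right) = \left(1, \tfrac{1}{2}, (\tfrac{1}{2})^2, (\tfrac{1}{2})^3, \ldots\right) \]

while the successive positions constitute the sequence

\[ S = \left(1, \tfrac{3}{2}, \tfrac{7}{4}, \tfrac{15}{8}, \ldots\right) = \left(2 - 1, 2 - \tfrac{1}{2}, 2 - (\tfrac{1}{2})^2, 2 - (\tfrac{1}{2})^3, \ldots\right) \]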



The remainder of this section is devoted to proving theorems that are used in determining either convergence or divergence of a series. These theorems are direct consequences of previous results obtained for sequences since there is a direct relation between the limit of a sequence and the sum of a series.

Theorem 2 If the series \(\sum_{n=1}^{\infty} a_n\) and \(\sum_{n=1}^{\infty} b_n\) both converge, then the series \(\sum_{n=1}^{\infty} (a_n + b_n)\) converges and the sums are related by

\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n \] ►

Theorem 3 If the series \(\sum_{n=1}^{\infty} a_n\) is convergent and if c is any real number, then the series \(\sum_{n=1}^{\infty} (ca_n)\) is convergent and

\[ \sum_{n=1}^{\infty} (ca_n) = c \sum_{n=1}^{\infty} a_n \] ►

The validity of both theorems follows directly from the definition of convergence of a series and Property 4 for sequences.

PROBLEMS

2. Show by example that convergence of \(\sum_{n=1}^{\infty} (a_n + b_n)\) does not imply convergence of either \(\sum_{n=1}^{\infty} a_n\) or \(\sum_{n=1}^{\infty} b_n\). (Hint: Try \((a_n) = (1, -1, 1, -1, \ldots)\).)

Example 4 Dropping of leaflets by air is a means sometimes used in advertising to reach a large group of consumers. The effectiveness of such an advertising effort is largely dependent upon the number of people that actually come in contact with the leaflets.

Suppose that it is assumed that a constant proportion \(\lambda\) of the leaflets actually survive a given time period t, while the remainder are lost, destroyed, or otherwise rendered unreadable. It is further assumed that each leaflet that survives until the tth time period will reach \(\beta\) people, on the average, during that period. Given that N leaflets are dropped, how many people do we expect to be reached by the leaflets?

If we assume that all of the leaflets dropped actually land on the target zone, then N leaflets will survive time period t = 0 and will reach \(N\beta\) people. Of the N leaflets surviving time period t = 0, only \(N\lambda\) will survive period t = 1 and will reach \(\beta N\lambda\) people. Of the \(N\lambda\) leaflets surviving time t = 1, only \(\lambda(N\lambda) = N\lambda^2\) will survive time t = 2 and will reach \(\beta N\lambda^2\) people, and so on.

The total number of the populace reached by the air drop is, using Equation (9) and Theorem 3, given by the series

\[ \beta N + \beta N\lambda + \beta N\lambda^2 + \cdots = \sum_{t=0}^{\infty} \beta N\lambda^t = \beta N \sum_{t=0}^{\infty} \lambda^t = \frac{\beta N}{1 - \lambda} \]
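The leaflet model can be simulated period by period and compared with the closed-form total. This is a sketch under assumed numbers: the function names and the values N = 1000, beta = 0.5, and lambda = 0.8 are illustrative and do not appear in the text.

```python
def people_reached(n_leaflets: float, beta: float, lam: float, periods: int) -> float:
    """Add beta * N * lam**t over the time periods t = 0, 1, ..., periods - 1."""
    return sum(beta * n_leaflets * lam ** t for t in range(periods))

def total_reached(n_leaflets: float, beta: float, lam: float) -> float:
    """Sum of the full series, beta * N / (1 - lam), valid for 0 <= lam < 1."""
    return beta * n_leaflets / (1.0 - lam)

for periods in (1, 5, 20, 100):
    print(periods, people_reached(1000, 0.5, 0.8, periods))
print("limit:", total_reached(1000, 0.5, 0.8))
```

With these assumed values most of the eventual audience is reached within the first twenty periods, since the surviving fraction 0.8**t falls off geometrically.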





vices, total income will be further increased by $k, and so on indefinitely, assuming that the same proportion of income is spent. Thus the original spending of $1 leads to an induced expenditure of \((k + k^2 + k^3 + \cdots)\) dollars. Hence the original expenditure of $1 leads to a total additional income in dollars of

\[ 1 + k + k^2 + \cdots = \sum_{n=0}^{\infty} k^n = \frac{1}{1 - k} \]

The sum 1/(1 - k) is called the multiplier, a term first introduced by Kahn* in 1931.


PROBLEMS 

1 Find the value of the economic multiplier defined in Example 2 when 
(a) k « 0 05 (b) k = 0 1 (c) A. = \ (d) k - 0 9 

Our results concerning geometric series show that \(\sum_{n=0}^{\infty} x^n\) converges if and only if the sequence \((x^n)\) converges to zero. In general, it is necessary (but not sufficient) that in order for the sequence \((a_n)\) to generate a convergent series \(\sum_{n=1}^{\infty} a_n\) we must have \(\lim (a_n) = 0\).

Theorem 1 If \(\sum_{n=1}^{\infty} a_n\) converges, then \(\lim (a_n) = 0\). Equivalently, if \(\lim (a_n) \ne 0\), then \(\sum_{n=1}^{\infty} a_n\) diverges.

PROOF For each n > 1, \(a_n = s_n - s_{n-1}\), where \((s_n)\) is the sequence of partial sums defined by (8). Convergence of this sequence means that \(\lim (s_n)\) exists and Property 3 for sequences guarantees that \(\lim (s_n) = \lim (s_{n-1})\). Therefore,

\[ \lim (a_n) = \lim (s_n) - \lim (s_{n-1}) = 0 \] ►

That \(\lim (a_n) = 0\) is not a sufficient condition for the convergence of \(\sum_{n=1}^{\infty} a_n\) is shown by the next example. Here \(\lim (a_n) = 0\), but \(\sum_{n=1}^{\infty} a_n\) diverges.

Example 3 The sequence (1/n) generates the harmonic series

\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots \]

We have already proved (Section 5.3, Examples 1 and 2) that \(\lim (1/n) = 0\) but that the sequence \((1 + \tfrac{1}{2} + \cdots + 1/n)\) of partial sums does not converge.
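The contrast between terms that shrink to zero and partial sums that keep growing shows up clearly in a short computation. This is a minimal sketch; the function name is an illustrative assumption, and the lower bound 1 + m/2 for the partial sum with \(2^m\) terms is the standard grouping estimate, used here as an assumption rather than quoted from the text.

```python
def harmonic_partial_sum(n: int) -> float:
    """Return 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

# The n-th term 1/n tends to zero while the partial sums continue to increase.
for n in (10, 1_000, 100_000):
    print(n, 1.0 / n, harmonic_partial_sum(n))

# Grouping estimate: the partial sum with 2**m terms is at least 1 + m/2.
for m in range(1, 11):
    print(m, harmonic_partial_sum(2 ** m), 1 + m / 2)
```

The growth is slow (roughly logarithmic), which is why a table of a few partial sums alone can give the misleading impression of convergence.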

*Kahn, R. F., "The Relation of Home Investment to Unemployment," The Economic Journal 41, 173-198 (1931).



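and, in general,

\[ t_{2^k - 1} < 1 + 2\left(\frac{1}{2^2}\right) + 4\left(\frac{1}{4^2}\right) + \cdots + 2^{k-1}\left(\frac{1}{(2^{k-1})^2}\right) \]
\[ = 1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^{k-1}} \]
\[ = \frac{1 - (\tfrac{1}{2})^k}{1 - \tfrac{1}{2}} < 2 \tag{11} \]

Hence the partial sums are bounded and the hyperharmonic series (10) converges when p = 2 ►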
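The bound in Formula (11) can be compared with actual partial sums of the series with p = 2. The sketch below is illustrative; the function names are assumptions, and `bound_11(j)` evaluates the closed form \(2 - (1/2)^{j-1}\) of the geometric sum on the right side of (11).

```python
def p_series_partial_sum(p: float, k: int) -> float:
    """Return 1 + 1/2**p + ... + 1/k**p."""
    return sum(1.0 / n ** p for n in range(1, k + 1))

def bound_11(j: int) -> float:
    """1 + 1/2 + ... + 1/2**(j-1), which equals 2 - (1/2)**(j-1)."""
    return 2.0 - 0.5 ** (j - 1)

# Partial sums whose index is one less than a power of 2, next to their bound.
for j in range(2, 11):
    k = 2 ** j - 1
    print(k, p_series_partial_sum(2, k), bound_11(j))
```

The partial sums stay well below 2, consistent with boundedness, although the bound itself says nothing about the exact value of the sum.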


PROBLEMS 

In Problems 3-10 write out the first four terms of each of the infinite series.


3 V' n 

4 V" 2 

5 v i 2 ! 

■ h « 2 + 1 

~1 «(« + 2) 

.tj 3" 

6 

7 V _L 

^ 2n” 

s.E^ 
1=1 1 

9 E(- 1)" +I f„ 

n=0 Z 

io 

„_i »(» + 2) 



In Problems 11-14, calculate the first four partial sums of each of the given 
infinite series 

«. i + i + 1 + i + lzi-i + i — j + 

13.1+i+i + Jj+ 14 l-J + *-* + 

15 Find the sums of the following series 

wt^ cwEi 

n«*0 t—* 


(c) 1 - § + | - * + (d) -4 + * - - 

16. Kintsch* in a model of choice behavior states that


Z) 0 - bT~~b s n 


1 - (1 - A)r 


Verify his claim 

17. Why does the geometric series diverge when

(a) r = 1? (b) r = -1?

*Kintsch, W., "A Response Time Model for Choice Behavior," Psychometrika 28, 27-32 (1963).



In this context, the quantity \(\beta/(1 - \lambda)\) is analogous to the multiplier in Example 2.

Theorem 4 Let \((a_n)\) be a sequence of non-negative real numbers. Then \(\sum_{n=1}^{\infty} a_n\) converges if and only if the sequence \(S = (s_n)\) of partial sums is bounded. In this case

\[ \sum_{n=1}^{\infty} a_n = \lim (s_n) = \operatorname{lub} (s_n) \]

PROOF Since \(a_n \ge 0\), the sequence S is increasing. The result is now a consequence of Property 5 for sequences ►


Example 5 The sequence \((1/n^p)\) generates the hyperharmonic series, sometimes called the p series,

\[ \sum_{n=1}^{\infty} \frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \frac{1}{4^p} + \cdots \tag{10} \]

When p = 1, we have the harmonic series of Example 3, the partial sums of which are not bounded (see Example 2 in Section 5.3). When p < 1, we have \(n^p < n\) or, equivalently, \(1/n^p > 1/n\). For p < 1, then, the kth partial sum

\[ t_k = \sum_{n=1}^{k} \frac{1}{n^p} \]

of the hyperharmonic series exceeds the corresponding partial sum

\[ s_k = \sum_{n=1}^{k} \frac{1}{n} \]

of the harmonic series. It follows that the sequence of partial sums \((t_k)\) is also unbounded and (Theorem 4) that the hyperharmonic series (10) diverges when p < 1.

When p > 1, the hyperharmonic series converges. We prove this for p = 2, leaving the general case, which is quite similar, as an exercise (Problem 24). The trick is to find an upper bound for the sequence of partial sums, as required by Theorem 4. To do this we look at the subsequence of partial sums \((t_1, t_3, t_7, t_{15}, t_{31}, t_{63}, \ldots)\) whose subscripts are one less than a power of 2. We find

\[ t_1 = 1 \]
\[ t_3 = 1 + \frac{1}{2^2} + \frac{1}{3^2} < 1 + \frac{1}{2^2} + \frac{1}{2^2} = 1 + 2\left(\frac{1}{2^2}\right) \]
\[ t_7 = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} \]
\[ < 1 + \frac{1}{2^2} + \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{4^2} + \frac{1}{4^2} + \frac{1}{4^2} = 1 + 2\left(\frac{1}{2^2}\right) + 4\left(\frac{1}{4^2}\right) \]



(b) argue that knowing the series (10) converges for p = 2 automatically means that it converges when p > 2. (Hint: When p > 2, \(n^p > n^2\).)

(c) prove that the hyperharmonic series converges for all p > 1. (Hint: Follow the proof for p = 2 step by step.)

25. Prove that for any fixed integer m,

\[ \sum_{j=1}^{\infty} a_j = \sum_{j=1}^{m} a_j + \sum_{j=m+1}^{\infty} a_j \]

(Hint: Apply Property 4(i) for sequences.) Thus \(\sum_{j=1}^{\infty} a_j\) converges if and only if \(\sigma_m = \sum_{j=m+1}^{\infty} a_j\) converges for every integer m = 0, 1, 2, 3, ...

26. Let \(\sigma_m\) be defined as in Problem 25. Prove that the series \(\sum_{j=1}^{\infty} a_j\) converges if and only if the sequence \((\sigma_0, \sigma_1, \sigma_2, \ldots)\) converges to zero.

Thus while convergence to zero of the sequence \((a_n)\) of individual terms is not sufficient to guarantee convergence of \(\sum_{n=1}^{\infty} a_n\), convergence to zero of the sequence \((\sigma_m)\) of "tail series" is sufficient.

27. Friedman* has proposed that permanent income, defined as the income considered to be normal by a household, be estimated by

\[ (1 - c) \sum_{\tau=0}^{\infty} c^{\tau} y_{\tau} \tag{12} \]

where \(y_{\tau}\) is the observed income \(\tau\) time periods ago and c is a constant between 0 and 1. Thus, according to Friedman's concept, permanent income depends on all previous incomes with the weight given to past incomes decreasing geometrically over time.

Assume that \(Y = \operatorname{lub} (y_{\tau})\) is the largest income previously attained. Show that the sum of the series (12) cannot exceed Y.


6.3 BASES FOR THE REAL NUMBER SYSTEM 

An interval is a set of points lying between two points on a line. If a and b are real numbers with a < b, there are four possible intervals which may be formed using points between a and b, depending on whether a or b each is or is not included in the interval. The four intervals \(\{x : a \le x \le b\}\), \(\{x : a < x \le b\}\), \(\{x : a \le x < b\}\), and \(\{x : a < x < b\}\) are, respectively, denoted by [a, b], (a, b], [a, b), and (a, b). The bracket indicates that an endpoint is included in the interval while a parenthesis indicates that the endpoint is not included. The notation (a, b) unfortunately is the same as that used for ordered pairs, but it will always be clear from the context which meaning is intended.

*Friedman, M., A Theory of the Consumption Function (National Bureau of Economic Research, New York, 1957).



18. In Problem 15, Section 6.1, what amount will Mr. Franklin receive if he wishes to continue to receive payments indefinitely? Check against the result obtained by using \(P = \sum_{j=1}^{\infty} P_j\) in Example 2, Section 6.1.

19. Prove that 

lim ^ x = hm x when \x] < I 

n -.os r— 0 n— too r=0 

20. Suppose that an attitude is considered as a single dichotomous attribute. That is, during any given time interval t, an individual can be in one of two states: either he is favorably disposed towards the issue in question (state 1) or he is opposed to the issue (state 2).

Depending upon the assumptions, a change in attitude may be represented by a variety of mathematical models. One such model is as follows. Given a population of n individuals, it is assumed that during any given time period t a constant proportion \(\alpha\) of the individuals in state 1 at the beginning of the period change to state 2, and a constant proportion \(\beta\) of those individuals in state 2 at the beginning of the period change to state 1. If we let \(n_{1t}\) be the number of individuals in state 1 at the beginning of the tth time period and \(n_{2t}\) be the number of individuals in state 2 at the beginning of the tth time period, then

\[ n_{1t} = n_{1(t-1)} - \alpha n_{1(t-1)} + \beta n_{2(t-1)} \]
\[ = n_{1(t-1)} - \alpha n_{1(t-1)} + \beta (n - n_{1(t-1)}) \]
\[ = n_{1(t-1)}(1 - \alpha - \beta) + \beta n \]

Show that as \(t \to \infty\), the number of individuals with favorable attitudes approaches the equilibrium

\[ \frac{\beta n}{\alpha + \beta} \]

regardless of the number initially in state 1. (Hint: Write \(n_{1t}\) in terms of \(n_{1(t-2)}\), \(n_{1(t-3)}\), and so on. Do you see a pattern? Use induction and then apply Problem 19.)

21. Let \(\sum a_n\) be a given series and let \(\sum b_n\) be a new series whose terms are the same as those in \(\sum a_n\) except that those for which \(a_n = 0\) have been omitted. Prove that \(\sum a_n\) converges to a number A if and only if \(\sum b_n\) converges to A.

22. Prove that convergence of a series is not affected by changing or omitting a finite number of terms. (Naturally, the sum may be changed.)

23. Prove that grouping the terms of a convergent series by inserting parentheses does not alter the sum. However, grouping terms in a divergent series may produce convergence. (Hint: After grouping, the new sequence of partial sums is a subsequence of the old.)

24. Concerning the hyperharmonic series of Example 5:

(a) establish Formula (11) by induction



\({} \tfrac{3}{100} \le \tfrac{7}{3} < 2 + \tfrac{3}{10} + \tfrac{4}{100}\), and \(n_3 = 3\), since \(2 + \tfrac{3}{10} + \tfrac{3}{100} + \tfrac{3}{1000} \le \tfrac{7}{3} < 2 + \tfrac{3}{10} + \tfrac{3}{100} + \tfrac{4}{1000}\). Continuing in this way, we conclude that the decimal expansion is

\[ \tfrac{7}{3} = 2.333\ldots \]

FIGURE 4

To check that our expansion is correct, note that 2.333... may be rewritten as the geometric series

\[ 2 + \sum_{k=1}^{\infty} \frac{3}{10^k} = 2 + \frac{3}{10} \sum_{k=0}^{\infty} \left(\frac{1}{10}\right)^k \]

Formula (9) shows that this sum is

\[ 2 + \frac{3/10}{1 - (1/10)} = 2 + \frac{3/10}{9/10} = 2\tfrac{1}{3} = \tfrac{7}{3} \]

as required.

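By partitioning each interval into two new intervals rather than 10, we generate the binary expansion of any positive number.

Example 2 The binary expansion of \(1\tfrac{2}{3}\) is obtained as follows. First, we take \(a_0 = 1\), since \(1 \le 1\tfrac{2}{3} < 2\) as shown in Figure 5. Next we partition the interval [1, 2) into two equal intervals \([1, 1 + \tfrac{1}{2})\) and \([1 + \tfrac{1}{2}, 2)\). Since \(1\tfrac{2}{3}\) lies in the interval \([1 + \tfrac{1}{2}, 2)\), we take \(n_1 = 1\). We partition again to find \(1\tfrac{2}{3}\) in the interval

\[ [1 + \tfrac{1}{2},\ 1 + \tfrac{1}{2} + \tfrac{1}{4}) = [1 + \tfrac{1}{2} + \tfrac{0}{4},\ 1 + \tfrac{1}{2} + \tfrac{1}{4}) \]

FIGURE 5

so that \(n_2 = 0\). Partitioning this interval we find \(1\tfrac{2}{3}\) in

\[ [1 + \tfrac{1}{2} + \tfrac{0}{4} + \tfrac{1}{8},\ 1 + \tfrac{1}{2} + \tfrac{0}{4} + \tfrac{2}{8}) \]

so \(n_3 = 1\), and so forth. Continuing in this way we find the binary (base 2) expansion of \(1\tfrac{2}{3}\) is

\[ 1 + \tfrac{1}{2} + \tfrac{0}{4} + \tfrac{1}{8} + \tfrac{0}{16} + \tfrac{1}{32} + \cdots = 1.101010\ldots \]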



Corresponding to any positive real number x there exists an interval \([a_0, a_0 + 1)\), where \(a_0\) is an integer, which contains x. Since the Archimedean property (Section 4.3) guarantees that some integer is greater than x, we simply take \(a_0 + 1\) to be the smallest such integer (Figure 2).

FIGURE 2 (number line marking \(a_0\) and \(a_0 + 1\))

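Now suppose we partition the interval \([a_0, a_0 + 1)\) into 10 smaller intervals \([a_0, a_0 + \tfrac{1}{10})\), \([a_0 + \tfrac{1}{10}, a_0 + \tfrac{2}{10})\), ..., \([a_0 + \tfrac{9}{10}, a_0 + 1)\) as shown in Figure 3. Then x lies in exactly one of these intervals, say

\[ \left[a_0 + \frac{n_1}{10},\ a_0 + \frac{n_1 + 1}{10}\right) \]

FIGURE 3 (number line marking \(a_0\), \(a_0 + \tfrac{1}{10}\), \(a_0 + \tfrac{2}{10}\), ..., \(a_0 + 1\))

We then partition this new interval into 10 equal intervals

\[ \left[a_0 + \frac{n_1}{10},\ a_0 + \frac{n_1}{10} + \frac{1}{100}\right),\ \left[a_0 + \frac{n_1}{10} + \frac{1}{100},\ a_0 + \frac{n_1}{10} + \frac{2}{100}\right),\ \ldots,\ \left[a_0 + \frac{n_1}{10} + \frac{9}{100},\ a_0 + \frac{n_1 + 1}{10}\right) \]

Again x lies in exactly one of these intervals, say

\[ \left[a_0 + \frac{n_1}{10} + \frac{n_2}{100},\ a_0 + \frac{n_1}{10} + \frac{n_2 + 1}{100}\right) \]

In this way we generate a monotone increasing sequence of points

\[ \left(a_0,\ a_0 + \frac{n_1}{10},\ a_0 + \frac{n_1}{10} + \frac{n_2}{100},\ \ldots\right) \]

which converges to x. (Why?) This procedure generates the decimal expansion

\[ x = a_0 + \frac{n_1}{10} + \frac{n_2}{100} + \frac{n_3}{1000} + \cdots = a_0.n_1 n_2 n_3 \ldots \]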
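The repeated-partition procedure amounts to multiplying the remaining fractional part by the base and reading off the integer part at each step. The following is a minimal sketch using exact rational arithmetic; the function name `expansion` and its `base` and `places` parameters are illustrative assumptions, and the same routine works for partitions into 2 parts as well as 10.

```python
from fractions import Fraction
import math

def expansion(x: Fraction, base: int, places: int) -> tuple[int, list[int]]:
    """Return (a0, [n1, n2, ...]) with x = a0 + n1/base + n2/base**2 + ..."""
    a0 = math.floor(x)            # x lies in [a0, a0 + 1)
    remainder = x - a0            # position of x inside that interval, in [0, 1)
    digits = []
    for _ in range(places):
        remainder *= base         # split the current interval into `base` equal parts
        d = math.floor(remainder) # index of the part that contains x
        digits.append(d)
        remainder -= d            # position of x inside the chosen part
    return a0, digits

print(expansion(Fraction(7, 3), 10, 6))   # decimal digits of 7/3
print(expansion(Fraction(5, 3), 2, 6))    # binary digits of 5/3
```

Using `Fraction` rather than floating point keeps every comparison exact, so the digits are not affected by rounding.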


Example 1 Let \(x = \tfrac{7}{3}\). Then \(a_0 = 2\), since \(2 \le \tfrac{7}{3} < 3\), as shown in Figure 4. Next \(n_1 = 3\), since \(2 + \tfrac{3}{10} \le \tfrac{7}{3} < 2 + \tfrac{4}{10}\). Then \(n_2 = 3\), since \(2 + \tfrac{3}{10} + {}\)




binary arithmetic are

       100100              1001011
     + 111110           ×     1101
      1100010              1001011
                         1001011
                        1001011
                        1111001111

PROBLEMS 

3. Perform the following operations of binary arithmetic. What decimal numbers are you working with?


(a) 

101110 1001 

(b) 

10101101110 


+ 111001 0110 


- 1110001 0010 

(c) 

no mot 

(d) 

111 00 on 


X 10001 I 


X 0 11 


(e) 1101 1011 — 100 100 (to three binary places) 

(0 0 0110001 — 1 01101 (to five significant decimal places) 


4. Add and f by first writing the binary expansion of each and then 
summing 

Octal numbers are numbers to the base 8. For example, the decimal (base 10) number 1342.75 can be expressed as

\[ (2 \times 8^3) + (4 \times 8^2) + (7 \times 8^1) + (6 \times 8^0) + (6 \times 8^{-1}) \]

which is the octal number 2476.6.
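Positional notation in any base up to 10 can be evaluated by weighting each digit with the matching power of the base. This is a minimal sketch; the function name `from_base` is an illustrative assumption, and it handles only the digits 0 through 9 and non-negative numbers.

```python
from fractions import Fraction

def from_base(text: str, base: int) -> Fraction:
    """Evaluate a numeral such as '2476.6' written in the given base (2 through 10)."""
    whole, _, frac = text.partition(".")
    value = Fraction(0)
    # Digits left of the point carry weights base**0, base**1, ... reading right to left.
    for position, ch in enumerate(reversed(whole)):
        value += int(ch) * Fraction(base) ** position
    # Digits right of the point carry weights base**-1, base**-2, ...
    for position, ch in enumerate(frac, start=1):
        value += int(ch) * Fraction(1, base ** position)
    return value

print(float(from_base("2476.6", 8)))      # the octal example above
print(float(from_base("1001010.1", 2)))   # a binary numeral
```

Exact fractions are used so that digits after the point, such as the octal .6 = 6/8, are evaluated without rounding.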

PROBLEMS 

5 Convert the following decimal numbers to octal and binary form 

(a) 24321 (b) 613428 3 (c) 318 625 

(d) 4321 068359375 (e) V' 

6 Construct octal addition and multiplication tables like the binary 
tables in Table 1 

7. Perform the following arithmetic in the octal system 


(a) 240176 

(b) 

32176 176 

4- 161125 


+ 11715 017 

(c) 351 651 

(d) 

2714 076 

- 276316 


- 3276 037 

(e) 210 31 

TO 

0 07761 


X 27 07 X 0 172 


(g) 4301 273 (to three octal places) 

(h) 22 135 -h 000243 (to five significant figures) 





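To check on results, we sum the geometric series

\[ 1 + \sum_{i=1}^{\infty} \frac{1}{2^{2i-1}} = 1 + \sum_{i=1}^{\infty} \frac{2}{4^i} = 1 + \frac{1}{2} \sum_{i=1}^{\infty} \left(\frac{1}{4}\right)^{i-1} \]

Again applying Formula (9) gives

\[ 1 + \frac{1}{2} \cdot \frac{1}{1 - (1/4)} = 1 + \frac{2}{3} = 1\tfrac{2}{3} \] ►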


PROBLEMS 

1 Find the ternary (base 3) expansion of f Check your result by summing 
a geometric senes 


The method for writing representations of numbers in any base is essentially the same as the base 10. Thus 74.5 in base 10 means

\[ (7 \times 10^1) + (4 \times 10^0) + (5 \times 10^{-1}) = (7 \times 10) + (4 \times 1) + (5 \times \tfrac{1}{10}) \]

But in base 2 this becomes

\[ 1001010.1 = (1 \times 2^6) + (0 \times 2^5) + (0 \times 2^4) + (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (0 \times 2^0) + (1 \times 2^{-1}) \]
\[ = (1 \times 64) + (0 \times 32) + (0 \times 16) + (1 \times 8) + (0 \times 4) + (1 \times 2) + (0 \times 1) + (1 \times \tfrac{1}{2}) \]

PROBLEMS

2. Write 74.5 in base 3; in base 5.


Representation of binary numbers within a computing machine is relatively simple. A voltage may be either high or low. A location in a magnetic memory may be magnetized in one or the other direction. A switch may be open or closed. Any such two-state device can hold one binary digit, and a row of them can represent a binary number as in the above example.

Binary arithmetic requires only four-element addition and multiplication tables; these are shown in Table 1. Examples of

TABLE 1 Binary Arithmetic Tables

     +   0   1            ×   0   1
     0   0   1            0   0   0
     1   1  10            1   0   1



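(a) The series

\[ \frac{1}{2} + \frac{\sqrt{2}}{3} + \frac{\sqrt{3}}{4} + \cdots + \frac{\sqrt{n}}{n + 1} + \cdots \]

diverges since \(\sqrt{n}/(n + 1) \ge 1/(n + 1)\) and \(\sum_{n=0}^{\infty} 1/(n + 1)\) is the divergent harmonic series.

(b) For every \(n \ge 1\), \(1/[n(n + 1)(n + 2)] < 1/n^3\). Hence the series

\[ \frac{1}{1 \cdot 2 \cdot 3} + \frac{1}{2 \cdot 3 \cdot 4} + \frac{1}{3 \cdot 4 \cdot 5} + \cdots + \frac{1}{n(n + 1)(n + 2)} + \cdots \]

converges.

(c) The series

\[ \frac{3}{1 \cdot 5} + \frac{4}{2 \cdot 6} + \frac{5}{3 \cdot 7} + \cdots + \frac{n + 2}{n(n + 4)} + \cdots \]

diverges since \((n + 2)/[n(n + 4)] > (1/2)(1/n)\) and \(\sum_{n=1}^{\infty} 1/n\) diverges.

(d) If n is a positive integer, the product \(1 \cdot 2 \cdot 3 \cdots n\) of the integers 1 through n is called n-factorial and denoted n!. It is easily seen that for \(n \ge 4\), \(n! > 2^n\) or \(1/n! < 1/2^n\). Hence the series

\[ \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots \]

converges since

\[ \sum_{n=0}^{\infty} \frac{1}{2^n} = 1 + \frac{1}{2} + \frac{1}{4} + \cdots \]

is a convergent geometric series.

►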
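The comparison in part (d) can be checked numerically from the term n = 4 onward, where the inequality \(1/n! < 1/2^n\) holds. This is a minimal sketch; the function names and the starting index 4 used for both tails are illustrative choices.

```python
import math

def factorial_tail(k: int) -> float:
    """Return 1/4! + 1/5! + ... + 1/k!."""
    return sum(1.0 / math.factorial(n) for n in range(4, k + 1))

def geometric_tail(k: int) -> float:
    """Return 1/2**4 + 1/2**5 + ... + 1/2**k, which stays below 1/8."""
    return sum(1.0 / 2 ** n for n in range(4, k + 1))

# Term by term, n! exceeds 2**n from n = 4 onward, so the first tail is the smaller one.
for k in (4, 6, 10, 20):
    print(k, factorial_tail(k), geometric_tail(k))
```

The finitely many terms before n = 4 do not affect convergence, which is why the comparison only needs to hold from some index N onward.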


Theorem 6 Ratio Test for Convergence Let \((a_n)\) be a sequence of positive real numbers and suppose that there is a number r < 1 and an integer N such that

\[ \frac{a_{n+1}}{a_n} \le r \quad \text{when } n \ge N \]

Then \(\sum_{n=1}^{\infty} a_n\) converges.

If, on the other hand, there exists a number r > 1 and an integer N such that

\[ \frac{a_{n+1}}{a_n} \ge r \quad \text{when } n \ge N \]

then \(\sum_{n=1}^{\infty} a_n\) is divergent.
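A concrete sequence makes the hypothesis of the ratio test easy to inspect. The sketch below uses \(a_n = n/2^n\), an illustrative choice not taken from the text; its ratios equal (n + 1)/(2n), which is at most 3/4 once n is at least 2, so r = 3/4 and N = 2 serve in the first part of the theorem.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    """Illustrative positive sequence a_n = n / 2**n."""
    return Fraction(n, 2 ** n)

def ratio(n: int) -> Fraction:
    """Successive ratio a_(n+1) / a_n, computed exactly."""
    return a(n + 1) / a(n)

for n in (1, 2, 3, 10, 50):
    print(n, ratio(n))

# Partial sums of the series stay bounded, as the test predicts.
print(float(sum(a(n) for n in range(1, 61))))
```

The first ratio equals 1, which is why the condition is only required for n at or beyond some N rather than for every n.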

PROOF As in the proof of Theorem 5 we may take N = 1. Then in the first case with r < 1, we have

\[ a_2 \le ra_1 \]
\[ a_3 \le ra_2 \le r^2 a_1 \]



Many computers now use numbers to the base 16, which are termed
hexadecimal numbers. A hexadecimal digit is called a byte and is equivalent
to four bits. Memory storage in a computer consists of individual words
(locations), each of which will hold four bytes.
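Conversion of a whole number to base 16 follows the same repeated-division scheme used for other bases. The sketch below is illustrative only. The function name `to_base` is an assumption, and it uses the modern digit symbols A through F for the values 10 through 15, which is not the notation of this text.

```python
HEX_DIGITS = "0123456789ABCDEF"  # A..F stand for the digit values 10..15


def to_base(n: int, base: int = 16) -> str:
    """Convert a non-negative integer to the given base by repeated division."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # remainder is the next digit from the right
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))


print(to_base(255))     # base 16
print(to_base(10, 2))   # the same routine in base 2
```

Fractional parts are handled separately, by repeated multiplication by the base rather than division.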

PROBLEMS 

8. Convert the following decimal numbers to their hexadecimal equivalents.
Use α = 10, β = 11, γ = 12, δ = 13, ε = 14, and ζ = 15.

(a) 3480 (b) 29146.75

(c) 3 (d) I 

9. Convert your answers in Problem 3 to hexadecimals.

10. What is the largest decimal number that can be stored in one hexadecimal
word?


6.4 TECHNIQUES FOR SUMMING INFINITE SERIES

In working with a particular series, one is normally interested in finding its
sum. Unfortunately, this is rarely possible, the geometric series being an
exceptional case. Usually, recourse must be had to methods of approximating
the sum. In any case, there is no point in trying to sum a divergent series
(which, of course, has no sum), and so one begins by determining whether
the series in question actually converges. Most of the techniques for
checking convergence apply to series having only positive terms.

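Theorem 5 Comparison Test for Convergence Let $(a_n)$ and $(b_n)$ be
sequences of positive numbers and suppose there is a positive integer $N$ such
that $a_n \le b_n$ whenever $n \ge N$. Then convergence of $\sum_{n=1}^{\infty} b_n$
implies convergence of $\sum_{n=1}^{\infty} a_n$. Equivalently, divergence of
$\sum_{n=1}^{\infty} a_n$ implies divergence of $\sum_{n=1}^{\infty} b_n$.

PROOF The $a_n$ and $b_n$ for $n < N$ may be omitted without affecting
convergence of either series. We may therefore assume that $N = 1$.
Convergence of $\sum_{n=1}^{\infty} b_n$ means that the sequence $(t_k)$
of partial sums

$$t_k = b_1 + b_2 + \cdots + b_k$$

is bounded. That

$$a_1 + a_2 + \cdots + a_k \le t_k$$

follows from $a_n \le b_n$. Thus the partial sums of $\sum a_n$ are bounded,
and by Theorem 4, $\sum_{n=1}^{\infty} a_n$ converges.

The most commonly used comparison series are the geometric and
hyperharmonic series. Recall (Equation (9) and Example 5 of Section 6.2)
that $\sum_{n=1}^{\infty} r^n$ converges if and only if $|r| < 1$, while
$\sum_{n=1}^{\infty} 1/n^p$ converges if and only if $p > 1$.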





series itself. However, it is not possible for a series to be absolutely
convergent and not convergent.

Theorem 7 If a series converges absolutely, then it converges.

PROOF Suppose $\sum_{n=1}^{\infty} |a_n| = A$ and define a sequence $(b_n)$ by

$$b_n = a_n + |a_n| \qquad (13)$$

Then $b_n = 0$ if $a_n$ is negative and $b_n = 2|a_n|$ if $a_n$ is positive. In any case,

$$0 \le b_n \le 2|a_n|$$

and the partial sums of $\sum_{n=1}^{\infty} b_n$ are bounded by $2A$. Theorem 4 guarantees
that $\sum_{n=1}^{\infty} b_n$ converges.

From (13), we have

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} b_n - \sum_{n=1}^{\infty} |a_n|$$

This being the difference of two convergent series, it follows that $\sum_{n=1}^{\infty} a_n$
also converges. This completes the proof.

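Example 3 (a) Since

$$\left| \frac{(-1)^{n+1}}{n^3 \sqrt{n}} \right| \le \frac{1}{n^3}$$

and $\sum_{n=1}^{\infty} 1/n^3$ is a convergent hyperharmonic series, the series

$$1 - \frac{1}{2^3 \sqrt{2}} + \frac{1}{3^3 \sqrt{3}} - \frac{1}{4^3 \sqrt{4}} + \cdots$$

converges absolutely (Theorem 5) and hence must converge (Theorem 7).

(b) If in Example 2 we apply the ratio test to the series

$$\frac{|x|}{1 \cdot 3} + \frac{|x|^2}{2 \cdot 4} + \frac{|x|^3}{3 \cdot 5} + \cdots + \frac{|x|^n}{n(n+2)} + \cdots$$

we find that the restriction $x > 0$ is unnecessary and that the series
converges absolutely when $|x| < 1$ (Theorems 6 and 7). It also
converges for $|x| = 1$ but diverges when $|x| > 1$. ►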

Once convergence of a series has been established, the problem becomes
that of finding, or approximating, its sum. Since the sequence
of partial sums converges to the sum, the most obvious approach is simply
to add as many terms as is practicable in terms of technical resources
(calculators, computers, manpower, etc.). This approach has the
disadvantage that it gives no indication
of how well a given partial sum approximates the true sum. With-




and, in general,

$$a_n \le r^{n-1} a_1$$

The comparison test (Theorem 5) shows that

$$\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} a_1 r^{n-1} = \frac{a_1}{1 - r}$$

In case $r > 1$, we have

$$a_2 \ge r a_1 > a_1$$

$$a_3 \ge r a_2 > a_2 > a_1$$

and, in general,

$$a_n > a_1$$

Since $a_1 > 0$, the sequence $(a_n)$ cannot converge to zero and hence
$\sum_{n=1}^{\infty} a_n$ cannot converge (Theorem 1). ►

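Example 2 In the series

$$\frac{x}{1 \cdot 3} + \frac{x^2}{2 \cdot 4} + \frac{x^3}{3 \cdot 5} + \cdots + \frac{x^n}{n(n+2)} + \cdots$$

suppose $x > 0$ and let $a_n = x^n / [n(n+2)]$. Then

$$\frac{a_{n+1}}{a_n} = \frac{x^{n+1}}{(n+1)(n+3)} \cdot \frac{n(n+2)}{x^n} = x \, \frac{1 + (2/n)}{[1 + (1/n)][1 + (3/n)]}$$

so that

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = x$$

If $x < 1$, there exists a number $r$ such that $x < r < 1$ and the definition
of a limit ensures that there is an integer $N$ such that $a_{n+1}/a_n \le r$ when
$n \ge N$. In this case the series converges. If $x > 1$, the same reasoning
establishes the existence of a number $r > 1$ and an integer $N$ such that
$a_{n+1}/a_n \ge r$ when $n \ge N$. In this case, then, the series diverges.
If $x = 1$, Theorem 6 does not apply, but in this case the series
converges since $1/[n(n+2)] < 1/n^2$ and $\sum_{n=1}^{\infty} (1/n^2)$ converges. ►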
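The behavior of the ratio in Example 2 can be checked numerically. The sketch below is an illustration only, and the function names `term` and `ratio` are assumptions. It evaluates the ratio of successive terms from the simplified formula and compares it with a direct quotient of terms for a small index.

```python
def term(x: float, n: int) -> float:
    """The n-th term x**n / (n(n + 2)) of the series in Example 2."""
    return x**n / (n * (n + 2))


def ratio(x: float, n: int) -> float:
    """Ratio of term n+1 to term n, in the simplified form x(1 + 2/n)/((1 + 1/n)(1 + 3/n))."""
    return x * (1 + 2 / n) / ((1 + 1 / n) * (1 + 3 / n))


x = 0.5
# For small n the closed form agrees with the quotient of successive terms.
direct = term(x, 6) / term(x, 5)
print(direct, ratio(x, 5))

# As n grows the ratio approaches x, here 0.5.
for n in (1, 10, 100, 10**6):
    print(n, ratio(x, n))
```

The simplified formula is used for large indices because powers of `x` underflow long before the ratio itself becomes hard to compute.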

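These tests can be extended to series whose terms are not all positive by
introducing the concept of absolute convergence.

Definition 1 A series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent if the series
$\sum_{n=1}^{\infty} |a_n|$ converges. ►

Absolute convergence thus depends on convergence
of the associated series of absolute values
rather than on convergence of the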




Since $\lim_{k \to \infty} a_{k+1} = 0$, it follows that the subsequence of partial sums
containing an odd number of terms also converges to $L$. Hence $L = \sum_{n=1}^{\infty} (-1)^{n+1} a_n$
and the original series is convergent. This completes the proof. ►

A useful property of convergent alternating series is that the magnitude of
the error made by using the sum of the first $k$ terms as an approximation to
the sum of the series is less than the magnitude of the $(k + 1)$st term. If $k$
is even,

$$\sum_{n=1}^{\infty} (-1)^{n+1} a_n - \sum_{n=1}^{k} (-1)^{n+1} a_n = a_{k+1} - (a_{k+2} - a_{k+3}) - (a_{k+4} - a_{k+5}) - \cdots$$

$$= (a_{k+1} - a_{k+2}) + (a_{k+3} - a_{k+4}) + \cdots \qquad (16)$$

so that this difference is positive but does not exceed $a_{k+1}$. If $k$ is odd,
$\sum_{n=1}^{k} (-1)^{n+1} a_n - \sum_{n=1}^{\infty} (-1)^{n+1} a_n$ has the same representations as (16) above. In either
case, then,

$$\left| \sum_{n=1}^{\infty} (-1)^{n+1} a_n - \sum_{n=1}^{k} (-1)^{n+1} a_n \right| < a_{k+1}$$


Example 4 Assume that we wish to compute the value, correct to two
significant figures, of the sum of the series

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$

which is convergent by the alternating series test.
Since $1/101 \approx 0.0099$, we obtain the sum correct to two significant figures
if we take the sum of the first 100 terms as an approximation to the sum. ►
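The error bound of Example 4 can be checked directly. The sketch below is illustrative and rests on one fact not stated in the text: the sum of this alternating series is the natural logarithm of 2. The function name `alternating_harmonic` is an assumption.

```python
import math


def alternating_harmonic(k: int) -> float:
    """Partial sum 1 - 1/2 + 1/3 - ... of the first k terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, k + 1))


s100 = alternating_harmonic(100)
error = abs(math.log(2) - s100)

# The error should be smaller than the 101st term, 1/101.
print(s100, error, 1 / 101)
```

The bound is a guarantee rather than an estimate of the actual error, which may be noticeably smaller than the first omitted term.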


PROBLEMS 

In Problems 1-9, test the given series for convergence by use of the
comparison test (Theorem 5) or the alternating series test (Theorem 8).


2 2 2 

2. ~=z + ~~=z + - 7 = + 

VI 2 V23 V34 

+ + <rl>— 

2 5 T 10 17 n? + 1 


Vn(/i + 1) 


4 - l+ -L-A. + -L + 
V5 V7 


H , ~ *r 

V2n -1 


3 + 1 


4+1 5+1 

‘ 42 - ! + 5" - 1 


* 2 


vs 


- I 





out some additional information, two types of errors may be committed:
(1) An insufficient number of terms may be added, resulting in a poor
approximation, or (2) many more terms than necessary may be included, resulting
in an unnecessary waste of time and resources. What is needed is an estimate
of the error associated with using a partial sum to approximate the sum.
Such an estimate may be derived from the conditions of Theorem 6. For
instance, suppose $N = 1$, $a_1 = 11$, and $r = \frac{1}{2}$ in that theorem. Then we
have $a_n \le r^{n-1} a_1$ for every $n$ so that

$$\sum_{n=k+1}^{\infty} a_n \le \sum_{n=k+1}^{\infty} r^{n-1} a_1 = \frac{a_1 r^k}{1 - r} = \frac{11}{2^{k-1}}$$

Thus the error made in using the partial sum $s_{10} = a_1 + a_2 + \cdots + a_{10}$
to approximate $\sum_{n=1}^{\infty} a_n$ cannot exceed $11/512 \approx 0.02$. This error is small
enough so that for most practical purposes this approximation suffices. The
use of 20 terms approximates the true sum to within 0.00002.
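The two error figures quoted above can be reproduced with exact arithmetic. This is a minimal sketch under the reading that the first term is 11 and the ratio bound is 1/2. The function name `tail_bound` is an assumption, and the formula used is the geometric-series bound on the terms after the k-th.

```python
from fractions import Fraction


def tail_bound(a1, r, k):
    """Upper bound a1 * r**k / (1 - r) on the sum of all terms after the k-th."""
    return a1 * r**k / (1 - r)


bound_10 = tail_bound(Fraction(11), Fraction(1, 2), 10)
bound_20 = tail_bound(Fraction(11), Fraction(1, 2), 20)

print(bound_10, float(bound_10))  # bound on the error of the 10-term partial sum
print(bound_20, float(bound_20))  # bound on the error of the 20-term partial sum
```

Because the bound shrinks by the factor r with each added term, it can also be inverted to decide in advance how many terms a required accuracy demands.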


Definition 2 A series whose terms are alternately positive and negative is
called an alternating series and may be written in the form

$$a_1 - a_2 + a_3 - a_4 + a_5 - a_6 + \cdots$$

where each $a_n > 0$. ►


Theorem 8 Alternating Series Test for Convergence Suppose $(a_1, a_2,
a_3, \ldots)$ is a decreasing sequence of positive numbers which converges to
zero. Then the alternating series

$$a_1 - a_2 + a_3 - a_4 + \cdots$$

must converge.

PROOF We are given that $a_{n+1} \le a_n$ for every $n$ and that $(a_n)$ converges to zero.
If $k$ is even, then

$$\sum_{n=1}^{k} (-1)^{n+1} a_n = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{k-2} - a_{k-1}) - a_k \qquad (14)$$

and

$$\sum_{n=1}^{k} (-1)^{n+1} a_n = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{k-1} - a_k) \qquad (15)$$

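Since $a_{n+1} \le a_n$, each parenthesis in (14) and (15) is non-negative. From (14)
we see that the partial sums containing an even number of terms are bounded
above by $a_1$, and from (15) that they form a non-decreasing subsequence.
This subsequence therefore converges to some number $L$. If $k$ is
even, $k + 1$ is odd and we can write

$$\sum_{n=1}^{k+1} (-1)^{n+1} a_n = \sum_{n=1}^{k} (-1)^{n+1} a_n + a_{k+1}$$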





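15. $\frac{1}{3} + \frac{1 \cdot 3}{3 \cdot 6} + \frac{1 \cdot 3 \cdot 5}{3 \cdot 6 \cdot 9} + \frac{1 \cdot 3 \cdot 5 \cdot 7}{3 \cdot 6 \cdot 9 \cdot 12} + \cdots$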


16. £ 

k = 1 


k + 3 


17. $\sum_{k=1}^{\infty} \frac{6^{k+1} k^3 (-1)^{k+1}}{8^k}$


18. Show that the series $\sum_{n=1}^{\infty} [(-1)^n / \sqrt{n}]$ converges but that

$$\sum_{n=1}^{\infty} \left| \frac{(-1)^n}{\sqrt{n}} \right|$$

does not.

If a series $\sum a_n$ converges but $\sum |a_n|$ does not, then $\sum a_n$ is said to be
conditionally convergent. Check the series in Problems 19-22 for
convergence. Do any of the series converge conditionally?


«-f-s + f- # + 

20. i - i + \ - i + 


21. 1 


2 3 3 3 

2 1 + 3 2 1 


4 3 

4 3 2 1 


+ 


22. 1 - — 4- _ 
4 ~ 9 


16 


23. Suppose we are interested in studying the effects of knowledge of past
results on a subject's judgment regarding the weight of a stimulus object.
We postulate an "ideal" subject who behaves as follows:

(i) He alternately overestimates and underestimates the actual weight.
(ii) Each overestimate is a constant proportion λ of the error
(underestimate) on the immediately preceding trial.
(iii) Each underestimate is a constant proportion β of the error
(overestimate) on the preceding trial.
(iv) His initial error (trial zero) is d and he overestimates on trial 1.

(a) Find the magnitude of the estimation error for each trial.

(b) Assuming that λ and β are both between 0 and 1, find the sum of the
magnitudes determined in (a). (Hint: First prove that the series is
convergent, then insert parentheses so as to group together successive
pairs of terms.)

(c) Find the sum of the absolute magnitudes of the errors.

24. In Problem 23, suppose that we modify the behavior of our "ideal"
subject by assuming that the λ and β values vary from trial to trial.
Specifically, assume that the λ value on trial 2n − 1 is 1/(2n − 1) and the β
value on trial 2n is 1/2n, n = 1, 2, 3, ....

(a) Find the magnitude of the estimation error for each trial.
(b) Prove that the alternating series is convergent.





+ ... 




(-D B 


»-2(«+ !)«(« “ 1) 

8 - 1 - 55 + ^ H 




22 (n-l) 


+ ... 


■G-^y + tu ,+ w 

associate learning 

S(-b)> 


10. In a model of paired-associate learning, Bower* uses the series

u-i 


j (1 - c) 

Show that the sum of this series is $(1/c)[1 - (1/N)]$.


11. In Problem 10, 

(a) show that 

sK 1 2 

i 

i 

o. 

converges 

(b) show that 


r, __l> 

^ M) 

1 (1 - cf 

converges. 

(c) Find 

Si' 

\ _ I> 

V M) 

\ (1 - c) : 


12. Suppose that the relative frequency $p_n$ of the $n$th most frequent reply to
a word association test is

$$p_n = \frac{1}{2n - 1} - \frac{1}{2n + 1}$$

Show that $\sum_{n=1}^{\infty} p_n$ converges.

Test each of the series in Problems 13-17 for convergence by the ratio test
(Theorem 6).


13. 


1 2 3 


+ — - + - * 
2-3-4^3T+5 


+ ... 


1 v" t a 5 

14. x + I£_ + Ii2£_ 

2 3 + 2-4 5 + 


* Bower, G. H., "Application of a Model to Paired-Associate Learning," Psychometrika 26, 255-280 (1961).
Association W1 ‘ h ^ 


(c) Estimate the error made by taking the sum of the first eight terms
of the given alternating series as an approximation to the sum of the
error magnitudes determined in (a).

25. Prove that if the series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ are absolutely convergent,
then so are the series $\sum_{k=1}^{\infty} (a_k \pm b_k)$ and $\sum_{k=1}^{\infty} c a_k$ for any constant $c$.
Thus absolutely convergent series can be combined in the same way as
ordinary finite sums.

26. Show that if a convergent series of numbers contains either a finite
number of negative terms or a finite number of positive terms, then it is
absolutely convergent.

27. (a) Show that the Ratio Test for Convergence (Theorem 6) implies the
following result: Let $(a_n)$ be a sequence of positive numbers and
suppose that $r = \lim (a_{n+1}/a_n)$ exists. Then $\sum_{n=1}^{\infty} a_n$ is absolutely
convergent if $r < 1$ and divergent if $r > 1$.
(b) Apply the result of (a) to the geometric series.


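6.5 DOUBLE SUMS AND DOUBLE SERIES

In many contexts, numbers are naturally arranged in rows and columns of
a table forming a rectangular pattern, thus

$$\begin{array}{ccccc}
x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{m1} & x_{m2} & x_{m3} & \cdots & x_{mn}
\end{array}$$

Here we have a total of $m \times n$ numbers arranged in $m$ rows and $n$ columns. The
position of a particular quantity $x_{jk}$ is indicated by the double subscript.
The variable $j$ ranges from 1 to $m$ and indicates the row number,
while $k$ ranges from 1 to $n$ and designates the appropriate column. Thus
$x_{j7}$ is an element in the seventh column, while $x_{52}$ is in
row 5 and column 2.

If we wish to add all $m \times n$ of the $x_{jk}$, we could first obtain the row totals

$$R_1 = \sum_{k=1}^{n} x_{1k}, \quad R_2 = \sum_{k=1}^{n} x_{2k}, \quad \ldots, \quad R_m = \sum_{k=1}^{n} x_{mk}$$

and then add these numbers to obtain the grand total

$$T = \sum_{j=1}^{m} R_j$$

Since $R_j = \sum_{k=1}^{n} x_{jk}$, $T$ may be rewritten as the double sum

$$T = \sum_{j=1}^{m} \left( \sum_{k=1}^{n} x_{jk} \right)$$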





If, on the other hand, we obtain column totals first, we find

$$T = \sum_{k=1}^{n} \left( \sum_{j=1}^{m} x_{jk} \right)$$

Since the total $T$ is the same in either case, we have established the following
rule.

Rule for Interchange of Order of Summation

$$\sum_{j=1}^{m} \sum_{k=1}^{n} x_{jk} = \sum_{k=1}^{n} \sum_{j=1}^{m} x_{jk}$$ ►

Thus we see that the order of summation is immaterial when the limits of
summation do not depend on the variables of summation $j$ and $k$. (Sums
with variable limits of summation are considered in Problem 2.)

Example 1 Suppose that $x_{jk}$ is always equal to $k - j$. Then

$$\sum_{j=1}^{4} x_{jk} = \sum_{j=1}^{4} (k - j) = (k - 1) + (k - 2) + (k - 3) + (k - 4) = 4k - 10$$

Hence

$$\sum_{k=1}^{5} \sum_{j=1}^{4} (k - j) = \sum_{k=1}^{5} (4k - 10) = 4 \left( \sum_{k=1}^{5} k \right) - 5(10) = 4(15) - 5(10) = 60 - 50 = 10$$

Interchanging the order of summation gives

$$\sum_{j=1}^{4} \sum_{k=1}^{5} (k - j) = \sum_{j=1}^{4} (15 - 5j) = 4(15) - 5 \left( \sum_{j=1}^{4} j \right) = 4(15) - 5(10) = 10$$ ►
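The interchange rule is easy to confirm by direct computation for Example 1. In the sketch below the variable names are illustrative assumptions; the limits (j from 1 to 4, k from 1 to 5) are those used in the example.

```python
# Sum over j first (inner), then over k (outer).
j_inner = sum(sum(k - j for j in range(1, 5)) for k in range(1, 6))

# Sum over k first (inner), then over j (outer).
k_inner = sum(sum(k - j for k in range(1, 6)) for j in range(1, 5))

# The intermediate totals that appear in the worked example.
totals_for_each_k = [sum(k - j for j in range(1, 5)) for k in range(1, 6)]  # 4k - 10
totals_for_each_j = [sum(k - j for k in range(1, 6)) for j in range(1, 5)]  # 15 - 5j

print(j_inner, k_inner)
print(totals_for_each_k, totals_for_each_j)
```

The two grand totals agree because every one of the 20 numbers is added exactly once in either order.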

PROBLEMS 

1. Rewrite the following expressions without using summation notation. 

(a) £ JL Xu < b > £ £ tTZTI 

i — l /-I 1-1 /-2 - J 

(c) 5 Z £ <<*) £ 2 

/— i »-i /-e 1-J “ J 

(Hint: Note that interchanging the order of summation makes no
difference.)





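Double series constitute the infinite counterpart to double sums. A
double series is generated by a doubly infinite array of numbers (a double
sequence) such as

$$\begin{array}{cccccc}
a_{11} & a_{12} & a_{13} & \cdots & a_{1k} & \cdots \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2k} & \cdots \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3k} & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \\
a_{j1} & a_{j2} & a_{j3} & \cdots & a_{jk} & \cdots \\
\vdots & \vdots & \vdots & & \vdots &
\end{array} \qquad (17)$$

As above, the symbol $a_{jk}$ represents the element in the $j$th row and $k$th
column of the array.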

Each row in the array (17) generates an infinite series. Symbolically, the
sum of the elements in the $j$th row is

$$R_j = \sum_{k=1}^{\infty} a_{jk} \qquad (18)$$

This sequence of row sums itself generates a series $\sum_{j=1}^{\infty} R_j$ which, on
inserting the series (18) for $R_j$, becomes the double series

$$\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{jk} \qquad (19)$$

Summing the elements in the separate columns first gives

$$C_k = \sum_{j=1}^{\infty} a_{jk} \qquad (20)$$

as the sum of the elements in the $k$th column of (17). The series generated
by the sequence $(C_k)$ is

$$\sum_{k=1}^{\infty} C_k = \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{jk} \qquad (21)$$

The series (19) and (21) are called the iterated double series generated
by the double sequence (17). Because of the successive limiting processes
involved in computing first (18) and then (19), or first (20) and then
(21), there is no guarantee that these iterated series have the same value and,
indeed, they may not (Problem 5). Fortunately, however, our discussions
will involve only absolutely convergent double series. To say that an iterated
series $\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{jk}$ "converges absolutely" means that the associated
double series $\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} |a_{jk}|$ of absolute values converges. In this case,




interchanging the order of summation still produces the same sum. The 
following theorem, which we state without proof, gives the details. 

Theorem 9 If either of the double series (19) or (21) converges absolutely, 
then 

(a) it converges. 

(b) the other series converges absolutely. 

(c) the two series have the same sum. ► 


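Example 2 If for each pair of non-negative integers $j$ and $k$ we have
$a_{jk} = (\frac{1}{2})^j (\frac{1}{3})^k$, then

$$\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} a_{jk} = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \left(\tfrac{1}{2}\right)^j \left(\tfrac{1}{3}\right)^k$$

Noting that $(\frac{1}{2})^j$ is constant relative to the inner summation (in which $k$ is
the summation variable), the double sum becomes

$$\sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^j \sum_{k=0}^{\infty} \left(\tfrac{1}{3}\right)^k = \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^j \left( \frac{1}{1 - \frac{1}{3}} \right) = \frac{3}{2} \cdot \frac{1}{1 - \frac{1}{2}} = 3$$ ►


Example 3 Let $a_{jk} = (\frac{1}{2})^j (\frac{1}{3})^k$ when $k \le j$, and $a_{jk} = 0$ when $k > j$. For
example, $a_{42} = (\frac{1}{2})^4 (\frac{1}{3})^2$, but $a_{24} = 0$. Then

$$\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} a_{jk} = \sum_{j=0}^{\infty} \sum_{k=0}^{j} \left(\tfrac{1}{2}\right)^j \left(\tfrac{1}{3}\right)^k$$

Summing first on $k$ gives

$$\sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^j \frac{1 - (\frac{1}{3})^{j+1}}{1 - \frac{1}{3}}$$

which may be written in two separate sums as

$$\frac{1}{1 - \frac{1}{3}} \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^j - \frac{1}{1 - \frac{1}{3}} \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^j \left(\tfrac{1}{3}\right)^{j+1}$$

The first of these sums has the value 3, as in Example 2, while the second
sum is

$$\frac{1}{1 - \frac{1}{3}} \cdot \frac{1}{3} \sum_{j=0}^{\infty} \left(\tfrac{1}{6}\right)^j = \frac{1}{2} \cdot \frac{1}{1 - \frac{1}{6}} = \frac{3}{5}$$

The total is $3 - \frac{3}{5} = \frac{12}{5}$. ►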
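A truncated numerical check of Example 3 is sketched below. It rests on an assumption about the damaged original, namely that the terms are (1/2)^j (1/3)^k for k ≤ j and zero otherwise. The function names and the truncation level of 60 are illustrative choices, not part of the text. Under that reading both orders of summation give approximately 12/5.

```python
def a(j: int, k: int) -> float:
    """Assumed terms of Example 3: (1/2)**j * (1/3)**k when k <= j, else 0."""
    return (1 / 2) ** j * (1 / 3) ** k if k <= j else 0.0


def rows_first(limit: int) -> float:
    """Sum each row over k, then add the row sums over j (truncated)."""
    return sum(sum(a(j, k) for k in range(limit)) for j in range(limit))


def columns_first(limit: int) -> float:
    """Sum each column over j, then add the column sums over k (truncated)."""
    return sum(sum(a(j, k) for j in range(limit)) for k in range(limit))


print(rows_first(60), columns_first(60), 12 / 5)
```

Agreement of the two orders is what Theorem 9 predicts here, since every term is non-negative and the series therefore converges absolutely.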





PROBLEMS

2. When limits of summation involve the summation variables one must be 
careful when interchanging the order in double sums. 

(a) By writing out all the terms, show that 

3 fc 3 3 

52 52 ~ Z) ^ 

i=i 3=1 k =J 

Why is 

3 k 

i=i j=i 

not equal to 

52 52 x ki 

j-i 1=1 

(b) Show that 

t t«r + D-t E«r + o 

1-1 t= 1 r-V Jc-r 


3. Find the sum 


£ 



4. By summing each double series separately, show that 

£ £** +, = £ £* ,+ ' 

1—0 ,=l J-O 1=0 


{Him Let a Jk « 0,y < k and a,* *= Jc fc+, ,y > k ) 

5. Consider the doubly infinite array of numbers

$$\begin{array}{rrrrrrl}
1 & -1 & 0 & 0 & 0 & 0 & \cdots \\
0 & 1 & -1 & 0 & 0 & 0 & \cdots \\
0 & 0 & 1 & -1 & 0 & 0 & \cdots \\
0 & 0 & 0 & 1 & -1 & 0 & \cdots \\
\vdots & & & & & &
\end{array}$$

For any natural number $m$, the $m$th row has 1 and −1, respectively, in
the $m$th and $(m + 1)$st positions. All other entries are zero.

(a) Show that for each $m$ the sum of the elements in the $m$th row is

$$R_m = \sum_{n=1}^{\infty} a_{mn} = 0$$

Hence,

$$\sum_{m=1}^{\infty} R_m = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = 0$$

(b) On the other hand, if $C_n$ denotes the sum of the series generated by
the elements in the $n$th column, show that

$$C_1 = 1 \quad \text{and} \quad C_n = 0 \text{ for all } n \ge 2$$

Thus,

$$\sum_{n=1}^{\infty} C_n = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} = 1$$

Interchanging the order of summation in a double series can make a
difference in the value of the sum obtained! Note that neither
iterated series is absolutely convergent.


SUPPLEMENTARY READING 

Hirschman, I. I., Infinite Series (Holt, Rinehart and Winston, New York, 1962).
Rainville, E. D., Infinite Series (Macmillan and Company, New York, 1967). 





7 COMBINATORIAL ANALYSIS


7.1 TWO BASIC PRINCIPLES OF COUNTING

The problem of determining the number of elements in a set arises so often
and in so many contexts that it is worthwhile to devote a chapter to developing
techniques for attacking this and related problems. Our approach is based on
the following two principles of counting. For any finite set $A$, the symbol
$n(A)$ is used to denote the number of elements in $A$.

CP 1 If $A$ and $B$ are disjoint sets, then

$$n(A \cup B) = n(A) + n(B) \qquad (1)$$

CP 2 If $A$ and $B$ are any two sets, then the product

$$n(A) \, n(B)$$

is the number of ordered pairs $(a, b)$ which can be formed by
choosing the first element $a$ from the set $A$ and the second element $b$ from
the set $B$. ►

In the language of sets, CP 2 states that $n(A \times B) = n(A) \, n(B)$, where
$A \times B$ is the Cartesian product of the sets $A$ and $B$. (Review Problems
5-11, Section 3.1.)

PROOF OF CP 1 Since every element of $A \cup B$ is a member
of either $A$ or $B$, the sum $n(A) + n(B)$ counts every element in $A \cup B$ at least
once. On the other hand, since $A$ and $B$ have no elements in common, no
element is counted in both $n(A)$ and $n(B)$. Thus, Formula (1) must be correct
since it counts each element in $A \cup B$ exactly once.




PROOF OF CP 2 Suppose $A$ has $m$ elements $a_1, a_2, \ldots, a_m$ and $B$ has $n$
elements $b_1, b_2, \ldots, b_n$. The element $a_1$ may be paired with each of the
elements $b_1, b_2, \ldots, b_n$ in turn, making a total of $n$ possible pairs with $a_1$
in first position. In the same way, we see that $a_2$ is the first element in $n$ pairs,
as are $a_3$, $a_4$, etc. Altogether, then, there are

$$n + n + n + \cdots + n = mn$$

pairs. ►

The proof of CP 1 indicates the criterion which should always be applied
to counting problems.

Counting Criterion (CC) The number of elements in a set has been
correctly determined if

(a) no element has been omitted in the counting process.

(b) no element has been counted more than once. ►

Example 1 If there are 17 alternate shipping routes from New York City
to Chicago and 12 routes from Chicago to Salt Lake City, then there are
17 × 12 = 204 routes from NYC to SLC which go through Chicago. Here
a different complete route is obtained for each different pair of routes, one
from NYC to Chicago and the other from Chicago to SLC. According to
CP 2, each of these is counted exactly once in our computation. ►
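The count in Example 1 can be reproduced by listing the ordered pairs explicitly. The sketch below is illustrative only: the route labels are hypothetical placeholders, and `itertools.product` from the Python standard library forms the Cartesian product of the two sets of routes.

```python
from itertools import product

# Hypothetical labels for the 17 and 12 alternate routes of Example 1.
nyc_to_chicago = [f"N{i}" for i in range(1, 18)]
chicago_to_slc = [f"C{i}" for i in range(1, 13)]

# Each complete route is an ordered pair (first leg, second leg).
through_routes = list(product(nyc_to_chicago, chicago_to_slc))

print(len(through_routes))                  # number of complete routes
print(len(set(through_routes)))             # same number: no pair is listed twice
```

Checking that the list contains no repeats corresponds to part (b) of the Counting Criterion; part (a) holds because every pairing of a first leg with a second leg is generated.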


Example 2 Robinson* has reported that in a randomly selected sample of
40 Democrats reelected to the Eighty-fifth Congress, 21 disapproved of, 16
moderately approved of, and 3 strongly agreed with the policies of the
Department of State headed by a Republican Secretary. Hence, there were
16 + 3 = 19 Democratic Congressmen who either moderately approved
of or strongly agreed with State Department policies. The conclusion was
reached by an application of CP 1. ►


CP 2 is often rephrased in terms of a job which consists of performing a
succession of tasks. If there are $m$ ways of performing the first task and if,
no matter how the first task is performed, there are then $n$ ways of performing
the second task, then there are $m \times n$ ways of performing the tasks in order.


Example 3 From a squadron of 50 members the commander chooses the
member whom he would most like to have fly support for him and also the
member whom he would least like to have. There are 50 × 49 = 2450 ways
in which this may be done. ►

* Robinson, J. A., "Process Satisfaction and Policy Approval in State Department-Congressional
Relations," American Journal of Sociology 67, 278-283 (1961).





Each of the principles CP 1 and CP 2 may be extended to any finite
number of sets. The general forms of these counting principles are given below.


CP 1′ If $A_1, A_2, \ldots, A_r$ are a finite number of mutually disjoint sets, then

$$n(A_1 \cup A_2 \cup \cdots \cup A_r) = n(A_1) + n(A_2) + \cdots + n(A_r)$$ ►

CP 2′ If there are $n_1$ ways to perform task $T_1$, if, no matter how task $T_1$
is performed, there are then $n_2$ ways of performing task $T_2$, if, no matter how
tasks $T_1$ and $T_2$ are performed, there are then $n_3$ ways of performing task
$T_3$, and so forth, then there are

$$n_1 \times n_2 \times n_3 \times \cdots \times n_r$$

ways of performing tasks $T_1, T_2, \ldots, T_r$ in the given order. ►

PROOF OF CP 1′ We have proved CP 1′ for $r = 2$. Suppose it is true for
$r = k$ and consider the mutually disjoint sets $A_1, A_2, \ldots, A_k, A_{k+1}$. Then
the sets

$$B = A_1 \cup A_2 \cup \cdots \cup A_k \quad \text{and} \quad C = A_{k+1}$$

must be disjoint (why?). Thus

$$n(A_1 \cup \cdots \cup A_k \cup A_{k+1}) = n(B \cup C) = n(B) + n(C)$$

which, from the inductive assumption, may be rewritten as

$$\underbrace{n(A_1) + n(A_2) + \cdots + n(A_k)}_{n(B)} + \underbrace{n(A_{k+1})}_{n(C)}$$ ►

PROBLEMS

1. Establish CP 2′, the general form of CP 2.


Example 4 There are 3,628,800 ways in which the Big Ten football teams
may be ranked in the final standings. For any such ranking may be thought
of as the outcome of performing in order 10 tasks, choosing a team to occupy
each place in the standings. The first task, choosing a team for first place,
can be accomplished in 10 ways. Regardless of the team chosen, there
remain nine possible choices for second place, then eight for third, etc. The total
number of possible rankings is

$$10 \times 9 \times 8 \times \cdots \times 2 \times 1 = 3{,}628{,}800$$ ►

Generalizing the argument of Example 4, it is easily seen that
the number of possible ways of arranging, ordering, or ranking a collection
of $n$ different objects is the product

$$n(n - 1)(n - 2) \cdots (2)(1)$$

of the integers 1 through $n$. For shorthand, this number is written $n!$ and
called n factorial. For instance,

$$4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24$$
$$7! = 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 5040$$
$$10! = 10 \cdot 9 \cdots 2 \cdot 1 = 3{,}628{,}800$$

Arrangements, or orderings, of a collection of objects are often called
permutations of those objects. We have seen that $n!$ is the number of
permutations of $n$ objects.

PROBLEMS

2. Compute 3!, 8!, and 11!. (Hint: 8! = 8 · 7!, 11! = 11 · 10!, and, in
general, (n + 1)! = (n + 1)n!.)

3. Write out the 4! (= 24) permutations of the numbers 1, 2, 3, and 4.

4. Compute 6!/(3!)(3!), 12!/(8!)(4!), and 9!/7!.

Example 5 Of the 3! (= 6) ways of arranging the numbers 1, 2, and 3,
only two, namely 231 and 312, satisfy the restriction that no number is in
its "proper" place. We could count the number of possible arrangements by
considering the job of filling three positions subject to the prescribed
conditions. For the first position there are two available choices (2 or 3) since 1
cannot be used. Regardless of which is chosen, there is only one choice for
second position (for example, if 2 is chosen first, then 3 must come next
since it cannot be placed third) and then one choice for the remaining
position. Altogether, there are 2 · 1 · 1 = 2 arrangements. ►

A convenient graphical method for analyzing the problem in Example 5
is indicated by the tree diagram shown in Figure 1. The two branches
emanating from the topmost point correspond to the choices available for the
first position. If 2 is chosen first, only 3 is available next, as indicated by the
single branch leading downward from the top branch labeled 2, and so
forth. The two endpoints a and b correspond to the two possible
arrangements. Similarly, the tree diagram of Figure 2 shows that there are nine
ways of arranging the numbers 1, 2, 3, and 4 in such a way that no number
appears in its proper position.

FIGURE 1 (tree diagram with endpoints a and b)
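The counts read off the two tree diagrams can be confirmed by brute force. The sketch below is illustrative only, and the function name `displaced_arrangements` is an assumption. It lists every permutation of 1 through n and keeps those in which no number occupies its own position.

```python
from itertools import permutations
from math import factorial


def displaced_arrangements(n: int):
    """Arrangements of 1..n in which no number appears in its proper position."""
    return [
        p
        for p in permutations(range(1, n + 1))
        if all(value != position for position, value in enumerate(p, start=1))
    ]


print(factorial(3), displaced_arrangements(3))   # all 3! orderings vs. the allowed ones
print(factorial(4), len(displaced_arrangements(4)))
```

Exhaustive listing is practical only for small n, since the number of permutations grows as n!; the tree diagram organizes the same search by hand.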


FIGURE 2 



PROBLEMS 

5. A direct application of CP 2 is not possible in the tree of Figure 2 since,
for example, it makes a difference, in the count for position two, whether
2 or 3 is placed in position one. Try to devise a method for counting
which employs both CP 1 and CP 2. (Hint: First partition the set of
all allowable arrangements into the union of the set A, of those
arrangements beginning with 2, and the set B, of those arrangements beginning
with either 3 or 4.)


Example 6 Caplow* has developed a theory of coalition formation in a
triad based on the following assumptions:

(1) Triad members may differ in strength. A stronger member can and will
seek to control a weaker member.

(2) Each member seeks to control others. Control over two is
preferred to control over one, which in turn is preferred to control over
none.

(3) Strength is additive, so that the strength of a coalition is the sum of
the strengths of its two members.

(4) There is a pre-coalition condition in every triad, in the sense that any
coercion by a stronger member to force a weaker member to join a
nonadvantageous coalition will provoke the formation of an
advantageous coalition (if one exists) to oppose the coercion.

* Caplow, T., "A Theory of Coalitions in the Triad," American Sociological Review 21,
489-493 (1956).





ing that in any pairing of two subsets of objects, the subject will prefer
that subset containing the preferred object, how can you identify the
preferred object by requiring only three subject judgments? (Compare
with Problem 15, Section 3.5.)

13. A physician has a diagnostic questionnaire consisting of 10 "yes" or
"no" questions. How many diagnostic categories can be identified with
this instrument?

14. A model of voting behavior assumes that 10% of those voting Republican
in one election will vote Democratic in the next election and 20% of
those voting Democratic in one election will vote Republican in the
next election. The model further assumes that the total number of voters
is constant at 60 million for each election and that 40% of the
population initially vote Republican.

(a) Draw a tree to represent the possibilities for four elections.

(b) Compute the number of Democratic and Republican voters in each
of the four elections.

15. In Problem 14, what proportion of the population would eventually vote
Republican if the tree were extended indefinitely? (Hint: Compare
with Problem 20, Section 6.2.)


7.2 COUNTING NUMBERS OF SAMPLES

The process of choosing a collection of elements from a given set is called
sampling. If r elements are chosen from a set containing n elements, we
obtain an r-sample from an n-set. For concreteness, we often visualize the set
as a box containing n objects distinguished either by numbers or colors.
Sampling then becomes the job of filling r positions with objects from the box.

The number of distinct samples which may be obtained depends on the
criteria by which samples are judged to be different: specifically, on whether
the order in which objects are drawn is important, and on whether objects
are replaced after being drawn.

Example 1 From the 3-set A = {a, b, c}, the following distinct 2-samples
may be drawn.

(a) If order is important and if the first sample object is replaced, we can
obtain one of the nine different samples (a, a), (a, b), (a, c), (b, a),
(b, b), (b, c), (c, a), (c, b), or (c, c). Our samples in this case are
ordered pairs of elements of A and collectively comprise the Cartesian
product A × A.

(b) If order is important but the first object is not replaced, we can obtain
any one of (a, b), (a, c), (b, a), (b, c), (c, a), or (c, b). Again, we obtain
ordered pairs, but pairs in which both elements are the same are not
allowed.





building whose shape is shown in Figure 4. If we assume that, except for
the dead ends 4 and 9, he never retraces a route, how many possible
routes are there to the agent's office? (Hint: Construct a tree diagram
using the corridor numbers shown in the diagram. Each path must end
in a choice of either 5 or 6.)


FIGURE 4


8. From the tree constructed in Problem 7, answer the following questions.

(a) How many routes require only two choices?

(b) How many routes require only three choices?

(c) How many routes require only four choices?

(d) How many routes require five or more choices?

9. Two judges independently rank in order the quality of a representative
sample of five technical education students' work. Assuming
that the ranking of the first judge is 1, 2, 3, 4, and 5, how many possible
ways are there for the ranking of the second judge to be in complete
disagreement?

10. In Problem 9, how many possible ways of ranking for the second judge
result in

(a) exactly one disagreement?

(b) exactly two disagreements?

(c) at most two disagreements?

(d) three or more disagreements?

11. Draw a tree diagram to represent all possible outcomes for four
elections, if we assume that in each election only the two
major parties are represented. How many possibilities result in

(a) no party change?

(b) exactly one party change?

(c) at most three party changes?

(d) not more than two party changes?

12. From a set of 13 objects, a subject prefers one
object to all others and is indifferent to the remaining objects. Assum-




PROBLEMS 

3. Compute $n^r$ and $(n)_r$ for $n = 10$ and $r = 3$.

4. Compare the above general results with the answers obtained for cases
(a) and (b) in Example 1 and Problems 1 and 2.


Example 2 (a) An inspector for the Widget Company draws a sample of
10 parts from a lot of 30 manufactured parts. If the sample is drawn with
replacement, there are $30^{10}$, or about 600 trillion possibilities, while if
sampling is without replacement, there are $30(29)(28) \cdots (21)$, or roughly
100 trillion possible samples.

(b) Ten prizes of differing values are to be distributed among 30 people
by drawing names from a hat. There are $30^{10}$ ways in which prizes may be
awarded if names drawn are replaced and $(30)_{10}$ ways if names are not
replaced. ►
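The two rounded figures in Example 2 can be computed exactly. The sketch below is illustrative; it uses `math.perm` from the Python standard library (available in Python 3.8 and later) for the number of ordered samples without replacement, and the variable names are assumptions.

```python
import math

n, r = 30, 10

# Ordered samples with replacement: n choices for each of the r positions.
with_replacement = n**r

# Ordered samples without replacement: n(n - 1)...(n - r + 1).
without_replacement = math.perm(n, r)

print(f"{with_replacement:,}")      # about 600 trillion
print(f"{without_replacement:,}")   # roughly 100 trillion
```

The same two lines answer Problem 3 when `n` and `r` are changed, since the counting formulas do not depend on the particular values.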


When r = n, the number $(n)_r$ becomes $n!$ and each ordered sample drawn
without replacement corresponds to a permutation of the n objects. For this
reason the problem of counting ordered r-samples is sometimes phrased as a
problem of counting r-permutations.

Unordered samples drawn with replacement, case (c), are of relatively
little importance for our purposes and are not considered further. The
number of unordered samples which may be drawn without replacement, case (d),
can be found by taking a fresh look at case (b). We obtained the number
$(n)_r = n(n-1)\cdots(n-r+1)$ by choosing objects one at a time in order
to fill the r available positions. But any particular arrangement which can
be obtained in this way consists of a certain set (unordered sample) of r
objects arranged in a certain order. Hence any such arrangement can be
obtained by first choosing the unordered sample and then arranging the
chosen objects. Using CP 2, this implies that

$$(n)_r = \left(\text{number of unordered samples of } r \text{ objects}\right) \times \left(\text{number of ways of arranging } r \text{ chosen objects}\right)$$

Since the number of arrangements (permutations) of r objects is $r!$, the
number of unordered samples of r objects drawn without replacement from a set
of n objects is

$$\frac{(n)_r}{r!}$$

For convenience, this number will be denoted by

$$\binom{n}{r}$$

Unordered samples drawn without replacement are also called combinations,
and problems of counting samples are called problems in combinatorial
analysis. Hence the name of this chapter.
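The four sampling cases can be checked by direct enumeration. The following is a minimal sketch in Python, not part of the original text; the function name `enumerate_samples` and the case labels are illustrative choices. It lists every 2-sample from the 3-set {a, b, c} and compares the counts for cases (a), (b), and (d) with $n^r$, $(n)_r$, and $(n)_r/r!$.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import factorial, perm


def enumerate_samples(objects, r):
    """List every r-sample from `objects` under the four sampling schemes."""
    return {
        "a": list(product(objects, repeat=r)),                 # ordered, with replacement
        "b": list(permutations(objects, r)),                   # ordered, without replacement
        "c": list(combinations_with_replacement(objects, r)),  # unordered, with replacement
        "d": list(combinations(objects, r)),                   # unordered, without replacement
    }


objects, r = "abc", 2
n = len(objects)
counts = {case: len(items) for case, items in enumerate_samples(objects, r).items()}
formulas = {
    "a": n ** r,                      # n^r
    "b": perm(n, r),                  # (n)_r = n(n-1)...(n-r+1)
    "d": perm(n, r) // factorial(r),  # (n)_r / r!
}
print(counts)
print(formulas)
```

Enumeration is practical only for small n and r; for lots of the size used in Example 2 the formulas are the only reasonable route.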


72 COUNTING NUMBERS OF SAMPLES / 161 



(c) If order is not important but the first object is replaced, we can obtain
one of {a, a}, {a, b}, {a, c}, {b, b}, {b, c}, or {c, c}. Here the samples
are sets of objects. Disregarding order means, for instance, that {a, b}
and {b, a} are regarded as identical.

(d) If order is not important and the first object is not replaced, we obtain
one of {a, b}, {a, c}, or {b, c}. Again the samples are 2-sets ►


PROBLEMS 

1 For each of the cases indicated in Example 1, find

(a) the number of 2-samples from a 4-set

(b) the number of 3-samples from a 4-set

2 Draw a tree diagram for cases (a) and (b) of Example 1 in each part of
Problem 1


When the order in which objects are drawn is important, the job of
counting the number of r-samples from an n-set is equivalent to counting the
number of ways of successively filling r positions with objects drawn from a
box which contains n objects (Figure 5). If the objects are replaced,
case (a), there are n ways of filling each position, regardless of the ways in
which the other positions are filled. Applying CP 2, it follows that there are
$n^r$ different r-samples which may be obtained when sampling with
replacement.

FIGURE 5 (r positions, labeled by position numbers, to be filled from a box of n objects)

If objects are not replaced, case (b), there are n ways of filling the first
position, after which, since the chosen object is now unavailable, there are
only (n - 1) ways of filling position two, then (n - 2) ways for position
three, etc. Hence there are

$$n(n-1)(n-2)\cdots(n-r+1) \qquad (2)$$

different r-samples which may be obtained when sampling without
replacement. For shorthand, the product (2) is usually written $(n)_r$.


PROBLEMS 


5 Be sure that it is clear to you why the above argument obeys the counting
criterion (CC)


6 Compute $\binom{n}{r}$ for n = 10, r = 3 and for n = 5, r = 2
7 Compute 


©■ ©• - 0 


8 Compare our general results with case (d) in Example 1 and Problems 
1 and 2 


9 Compare 


0 - 


Example 2 (continued) The number of different unordered samples of size 10
which could be drawn without replacement from a lot of 30 parts is

$$\binom{30}{10} = \frac{(30)(29)\cdots(21)}{(10)(9)\cdots(1)}$$

or about 30 million.


The number

$$\binom{n}{r}$$

can be written entirely in terms of factorials. Since

$$n! = n(n-1)\cdots(n-r+1)(n-r)(n-r-1)\cdots(1) = (n)_r\,(n-r)!$$

we have

$$\binom{n}{r} = \frac{(n)_r}{r!} = \frac{n!}{r!\,(n-r)!} \qquad (3)$$

The form of (3) makes it apparent that

$$\binom{n}{r} \quad\text{and}\quad \binom{n}{n-r}$$

are identical. For a counting interpretation of this fact, we note that
$\binom{n}{r}$ may be interpreted as the number of r-subsets of a given n-set.
To each r-subset there corresponds a unique (n - r)-subset consisting of
the elements not chosen, and thus the number of r-subsets is equal to the
number of (n - r)-subsets.


Example 3 The number of distinguishable arrangements of r red balls and
n - r white balls is

$$\binom{n}{r}$$

This is because each distinct arrangement is determined by choosing from
the n available positions the r positions in which the r red balls are to be
placed ►


PROBLEMS 

10. List the $\binom{5}{2}$ distinguishable arrangements of two letters a and
three letters b


11. Argue that the count

$$\binom{n}{r}$$

would apply to any problem of arranging n objects, r of one kind and
n - r of another


Counting the number of r-subsets of an n-set is equivalent to counting the
number of ways of partitioning an n-set into two subsets, the first containing
r objects and the second n - r. By applying this argument k successive
times, we may count the number of ways of partitioning an n-set into k
subsets, the first containing $n_1$ objects, the second $n_2$, etc., where, of
course, $n_1 + n_2 + \cdots + n_k = n$.

The $n_1$ objects for the first subset may be chosen in $\binom{n}{n_1}$ ways.
Regardless of which objects are chosen, there remain $n - n_1$ objects from
which we may choose $n_2$ objects for the second subset in

$$\binom{n-n_1}{n_2}$$

ways. In general, for the jth subset, there are $n - n_1 - n_2 - \cdots - n_{j-1}$
available objects from which $n_j$ objects may be chosen in

$$\binom{n - n_1 - \cdots - n_{j-1}}{n_j}$$

ways. Hence the total number of ways of performing the required task is

$$\binom{n}{n_1}\binom{n-n_1}{n_2}\binom{n-n_1-n_2}{n_3}\cdots\binom{n-n_1-\cdots-n_{k-1}}{n_k}$$

Using the factorial form (3), this may be simplified to

$$\frac{n!}{n_1!\,n_2!\cdots n_k!}$$

which we denote by

$$\binom{n}{n_1, n_2, \ldots, n_k}$$


PROBLEMS 

12 Write out the

$$\binom{4}{2, 1, 1}$$

ways of partitioning a set of four objects into three subsets, the first
containing two objects and the other two, one each

13 Go through the details of the simplification indicated above 

Example 4 A bridge hand is a subset of 13 cards chosen from a standard
deck of 52 cards. There are

$$\binom{52}{13}$$

or about 635 billion, such subsets.

Dealing the cards to the four players around a bridge table amounts to
partitioning the set of 52 cards into four subsets of 13 cards each. There are

$$\frac{52!}{13!\,13!\,13!\,13!}$$

or about $536 \times 10^{26}$ (536 followed by 26 zeros) different possible deals ►
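The partition count can be computed either as a product of successive binomial coefficients or from the factorial form. The sketch below is illustrative only (the function names are not from the text) and evaluates both forms for the bridge example.

```python
from math import comb, factorial


def multinomial(n, parts):
    """Ways to partition an n-set into subsets of the given sizes,
    computed as a product of successive binomial coefficients."""
    if sum(parts) != n:
        raise ValueError("subset sizes must sum to n")
    ways, remaining = 1, n
    for size in parts:
        ways *= comb(remaining, size)  # choose this subset from what is left
        remaining -= size
    return ways


def multinomial_factorial_form(n, parts):
    """The same count written as n! / (n_1! n_2! ... n_k!)."""
    denominator = 1
    for size in parts:
        denominator *= factorial(size)
    return factorial(n) // denominator


hands = comb(52, 13)                        # bridge hands
deals = multinomial(52, [13, 13, 13, 13])   # deals to four players
print(hands)
print(f"{deals:.3e}")
```

Python integers are exact, so no rounding enters until the final formatted print.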

Example 5 In a public opinion poll, a subject is asked to express agreement
or disagreement with each of 10 items. The set $A_k$ of response patterns
in which he agrees with exactly k items contains

$$\binom{10}{k}$$

elements, the number of possible choices of k items of the 10 presented.
Hence the number of response patterns in which the subject agrees
with at most three items is, using CP 1,

$$n(A_0 \cup A_1 \cup A_2 \cup A_3) = n(A_0) + n(A_1) + n(A_2) + n(A_3)$$

$$= \binom{10}{0} + \binom{10}{1} + \binom{10}{2} + \binom{10}{3}$$

$$= 1 + 10 + 45 + 120 = 176$$ ►
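The count of 176 can be confirmed by listing all $2^{10}$ response patterns. This is a small illustrative check, with variable names chosen here rather than taken from the text.

```python
from itertools import product
from math import comb

# Every pattern of agree/disagree responses to 10 items.
patterns = list(product(("agree", "disagree"), repeat=10))

# Patterns with at most three agreements, counted directly and by formula.
at_most_three = sum(1 for p in patterns if p.count("agree") <= 3)
by_formula = sum(comb(10, k) for k in range(4))
print(at_most_three, by_formula)
```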





Example 6 Consider a committee whose members successively vote in
favor of some bill. As soon as a majority of the members have voted for the
bill it is declared passed. The member whose vote provided the majority is
called pivotal and is given credit for passage of the bill.
Shapley and Shubik* have proposed the proportion of possible voting
orders in which an individual is pivotal as a measure of his voting power.
For instance, in a committee of three in which the chairman has two votes
and the other members one each, the chairman's power is 2/3 and the power
of each member 1/6. For, if a majority is required for passage, the chairman
must be included in every winning coalition and is pivotal if he votes either
second or third. Thus in the six possible voting orders

ABC, ACB, BAC, BCA, CAB, and CBA


the chairman C is pivotal in the first four, member A is pivotal in the fifth,
and B is pivotal in the last.

As a rationale for the method, we may assume that members are lined up
in support of a bill in order of their intensity of feeling for the bill. In this
ordering the member occupying the pivotal position is crucial to the
formation of a winning coalition. If we assume that over the spectrum of bills
which might be introduced, each ordering is as likely to occur as any other,
the Shapley-Shubik measure provides a realistic indicator of power.
As a more complex illustration of the method, Shapley and Shubik
considered a tricameral legislature in which the first house contained member A
only, the second contained three members $B_1$, $B_2$, and $B_3$, and the third,
five members $C_1$, $C_2$, $C_3$, $C_4$, and $C_5$. A winning coalition must contain A,
at least two $B_i$, and at least three $C_j$. Thus, for example, in the ordering

Bj B% Ci C 2 A B2 C3 C5 


$B_2$, A, and $C_1$ are pivotal within their respective houses (as indicated by
the dots), while $C_1$ (third dot) is pivotal for passage of the bill.
Member $C_1$ is pivotal if he is preceded in the ordering by A, exactly two
other $C_j$, and either two or three $B_i$. Thus the number of orderings in which
he is pivotal in position six is computed by choosing two $C_j$ (there are
$\binom{4}{2} = 6$ ways), two $B_i$ (3 ways), and A (1 way), arranging the five
chosen members in the first five positions (5! ways), and the other three
members in the last three positions (3! ways). There are a total of
$6 \cdot 3 \cdot 1 \cdot 5! \cdot 3!$ such orderings.
Similarly, there are $6 \cdot 1 \cdot 1 \cdot 6! \cdot 2!$ ways in which $C_1$ is
pivotal in position seven. Since there are 9! possible arrangements of the
nine legislators, the power of $C_1$ is

$$\frac{6 \cdot 3 \cdot 5!\,3! + 6 \cdot 6!\,2!}{9!} = \frac{5}{84}$$

►
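The power index can also be obtained by brute force, running through every voting order and recording who casts the pivotal vote. The sketch below is one way to do this and is not taken from Shapley and Shubik; the function `shapley_shubik`, the member labels, and the two winning rules are written here to match the committee of three and the tricameral legislature described in the example.

```python
from fractions import Fraction
from itertools import permutations


def shapley_shubik(members, is_winning):
    """Fraction of voting orders in which each member is pivotal.

    `is_winning` takes a set of members and reports whether that
    coalition is able to pass the bill.
    """
    pivots = {m: 0 for m in members}
    total = 0
    for order in permutations(members):
        total += 1
        coalition = set()
        for member in order:
            coalition.add(member)
            if is_winning(coalition):
                pivots[member] += 1  # first vote that completes a winning coalition
                break
    return {m: Fraction(count, total) for m, count in pivots.items()}


# Committee of three: chairman C has two votes, A and B one each; 3 of 4 votes pass.
votes = {"A": 1, "B": 1, "C": 2}
committee = shapley_shubik(list(votes), lambda s: sum(votes[m] for m in s) >= 3)

# Tricameral legislature: A, at least two B's, and at least three C's are needed.
legislators = ["A", "B1", "B2", "B3", "C1", "C2", "C3", "C4", "C5"]


def passes(coalition):
    return ("A" in coalition
            and sum(m.startswith("B") for m in coalition) >= 2
            and sum(m.startswith("C") for m in coalition) >= 3)


tricameral = shapley_shubik(legislators, passes)
print(committee)
print(tricameral["C1"], tricameral["A"], tricameral["B1"])
```

The tricameral case examines all 9! = 362,880 orderings, which takes a moment but needs no counting argument; for larger bodies the combinatorial approach of the example is the practical one.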


*Shapley, L. S., and Shubik, M., "A Method for Evaluating the Distribution of Power
in a Committee System," American Political Science Review 48, 787-792 (1954).





Relatively few counting problems can be solved by a simple application of 
one of the formulas developed above Aside from use of the numbers 



we strongly recommend that all counting problems be analyzed by direct 
use of the basic principles CP 1 and CP 2, taking the discussion in this section 
as a guide 


PROBLEMS 


14. An electric utility company sends teams of three men on emergency
repair jobs. If the company has 10 qualified repairmen, how many different
teams can be formed?

15. Five candidates for branch manager in a food store chain are ranked
according to their weighted average score on a number of variables thought
to be relevant to job success.

(a) How many ways can this be done if no two candidates receive the
same rank?

(b) How many ways can the candidates be ranked if no two receive the
same rank and candidate A is always ranked above candidate B?

16. A sample of five transistors is selected from a lot of 20. The sampling
plan is such that the entire lot will be rejected if three or more defective
transistors are found in the sample.

(a) How many different unordered samples could be drawn?

(b) If seven of the 20 transistors are defective, what percent of the
samples would lead to rejection?

17. In Problem 16, suppose the decision strategy is as follows: If zero or
one defective units are found in the sample, the lot is accepted. If three or
more units are found defective, the lot is rejected. If exactly two sample
units are defective, a second sample of three units is selected from the
remaining 15 and the lot is accepted if at least two units in the second
sample are not defective. If seven transistors in the lot are defective, what
percent of the possible samples leads to rejection?

18. Analogous to Example 3, show that there are

$$\frac{n!}{n_1!\,n_2!\cdots n_k!}$$

distinguishable arrangements of n objects, $n_1$ of one kind, $n_2$ of a second
kind, ..., and $n_k$ of a kth kind. Give some other interpretations of this
result.

19. In Example 5, allow the possibility that the subject may be neutral
toward an item.

(a) How many ways can he respond so that he agrees with four items,
disagrees with five items, and is neutral toward the remaining item?

(b) How many ways can he answer so that he agrees with at least one
item and disagrees with at least one?

20. In a four-man committee, members A, B, C, and D have 3, 2, 1, and 1 
votes, respectively Simple majority wins Compute the Shapley-Shubik 
power rating (see Example 6) for each member 

21. In Example 6, for reasons of symmetry, the power of each of the
members $C_2$, $C_3$, $C_4$, and $C_5$ is also 5/84. Show that the power of A is 8/21
and the power of each of $B_1$, $B_2$, and $B_3$ is 3/28. Thus, assuming that
the power of a house is the sum of the powers of its members, the three
houses possess power in inverse relation to their size!

22. Shapley and Shubik (refer to Example 6) have analyzed the power of
the "Big Five" in the U.N. Security Council. The council consists of
11 members, five of whom have veto power. Passage of a measure
requires seven affirmative votes and no vetoes. Verify the Shapley-Shubik
calculations which give combined power 76/77 to the Big Five and 1/77
to the other six members.

23. (a) In Example 3, Section 3.4, how many I scales would be obtained
from a J scale having three stimuli? four stimuli? five stimuli?

(b) Develop a general formula for the number of I scales obtainable
from a J scale of n stimuli.


7.3 BINOMIAL COEFFICIENTS 

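In addition to their use in counting, the numbers

$$\binom{n}{r}$$

also appear as coefficients in binomial expansions. For instance,
$(x + y)^2 = x^2 + 2xy + y^2$ may be written as

$$(x + y)^2 = \binom{2}{0} x^2 y^0 + \binom{2}{1} x^1 y^1 + \binom{2}{2} x^0 y^2$$

and

$$(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3$$

as

$$(x + y)^3 = \binom{3}{0} x^3 y^0 + \binom{3}{1} x^2 y^1 + \binom{3}{2} x^1 y^2 + \binom{3}{3} x^0 y^3$$

These results are special cases of the following theorem.

Binomial Expansion Theorem If n is a positive integer, then

$$(x + y)^n = \binom{n}{0} x^n y^0 + \binom{n}{1} x^{n-1} y^1 + \cdots + \binom{n}{n-1} x^1 y^{n-1} + \binom{n}{n} x^0 y^n = \sum_{r=0}^{n} \binom{n}{r} x^{n-r} y^r \qquad (4)$$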

PROOF To see why the numbers

$$\binom{n}{r}$$

appear in binomial expansions, let us first consider the case when n = 2.
Applying the distributive law twice to $(x + y)^2 = (x + y)(x + y)$ gives

$$(x + y)(x + y) = (x + y) \cdot x + (x + y) \cdot y = x \cdot x + y \cdot x + x \cdot y + y \cdot y$$

Each term in the sum is the product of two quantities, each of which is either
x or y. Closer inspection shows that in each product, the first number came
from the first (x + y) factor and the second number from the second (x + y)
factor. The final result consists of the sum of all possible products which may
be obtained in this way.

Similarly, $(x + y)^n$ is the product of n different (x + y) factors. A term
in the expansion is obtained by choosing either x or y from each factor. A
term of the form $x^{n-r} y^r$ is uniquely determined by the r factors from which
y is selected. Since there are

$$\binom{n}{r}$$

ways in which r factors may be chosen, the term

$$\binom{n}{r} x^{n-r} y^r$$

appears in the expansion of $(x + y)^n$.
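The counting argument of the proof can be imitated directly: pick x or y from each of the n factors and tally how many picks produce each product. The following sketch, with an illustrative function name, does this for n = 5 and compares the tallies with the binomial coefficients.

```python
from collections import Counter
from itertools import product
from math import comb


def expansion_by_choice(n):
    """Expand (x + y)^n by choosing x or y from each of the n factors.

    Returns a Counter mapping r (the number of y's chosen) to the number
    of products equal to x^(n-r) y^r.
    """
    return Counter(choice.count("y") for choice in product("xy", repeat=n))


coefficients = expansion_by_choice(5)
for r in range(6):
    print(r, coefficients[r], comb(5, r))
```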


PROBLEMS 

1 Write out the binomial expansions of (1 + ,)* and (s 2 - 4 i) 3 
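Example 1 Putting x = y = 1 in (4) yields

$$2^n = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n}$$

Recall that $\binom{n}{r}$ is the number of subsets of size r which
may be chosen from an n-set. Hence, the total number of subsets of all sizes is

$$\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n$$ ►

Example 2 Putting x = 1 and y = -1 in (4) we obtain

$$0 = (1 - 1)^n = \sum_{r=0}^{n} \binom{n}{r} (-1)^r = \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n}$$

Adding this equation to that of the preceding example gives

$$2^n + 0 = \sum_{r=0}^{n} \binom{n}{r} \left[1 + (-1)^r\right]$$

Since $(-1)^r$ is 1 when r is even and -1 when r is odd, the two sums
combined yield twice the sum of those numbers

$$\binom{n}{r}$$

for which r is even. Hence,

$$\binom{n}{0} + \binom{n}{2} + \binom{n}{4} + \cdots = 2^{n-1}$$ ►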

Example 3 We have already seen (Problem 8, Section 5.4) that if a
principal P is invested at interest rate i per compounding period, the
accumulated amount after n compounding periods is $A_n = P(1 + i)^n$. Thus
the worth of $100 two and one-half years after being invested at 4%
compounded quarterly is

$$A_{10} = \$100(1 + 0.01)^{10}$$

By writing out four terms of the binomial expansion we obtain the
approximate value

$$A_{10} \approx \$100\left[1 + \binom{10}{1}(0.01) + \binom{10}{2}(0.01)^2 + \binom{10}{3}(0.01)^3\right]$$

$$= \$100\,[1 + 0.1 + 0.0045 + 0.00012]$$

$$= \$110.46$$

which, as the reader may verify, is correct to the nearest penny ►
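The verification left to the reader can be carried out numerically. This short sketch, with an illustrative helper name, sums the first four terms of the expansion of $(1 + 0.01)^{10}$ and compares the result with the full power.

```python
from math import comb


def binomial_partial_sum(i, n, terms):
    """Sum of the first `terms` terms of the binomial expansion of (1 + i)^n."""
    return sum(comb(n, r) * i ** r for r in range(terms))


approx = 100 * binomial_partial_sum(0.01, 10, 4)
exact = 100 * (1 + 0.01) ** 10
print(round(approx, 2), round(exact, 2))
```

The omitted terms begin with $\binom{10}{4}(0.01)^4$, which contributes only about two ten-thousandths of a dollar, so the four-term value agrees with the full power to the nearest penny.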


PROBLEMS 

2. Find, to the nearest $10, the accumulated amount of $1000 invested for
20 years at 5% compounded annually

Example 4 Suppose a, b, and k are positive integers with a ≥ k and b ≥ k.
Then

$$\binom{a+b}{k} = \binom{a}{k}\binom{b}{0} + \binom{a}{k-1}\binom{b}{1} + \cdots + \binom{a}{1}\binom{b}{k-1} + \binom{a}{0}\binom{b}{k} \qquad (5)$$

To see this, consider a box containing a + b balls of which a are apricot
colored and b are brown. The binomial coefficient $\binom{a+b}{k}$ represents
the number of different k-subsets which may be chosen from this box. A
particular subset may contain any number of brown balls between zero and k,
inclusive. The number of subsets containing exactly r brown balls is
$\binom{a}{k-r}\binom{b}{r}$, since we may choose any r brown balls from the b
available balls, and then the remaining k - r balls from the a apricot balls.
Since the total number of k-subsets is equal to

(number of subsets containing no brown balls) + (number containing one brown ball) + ... + (number containing k brown balls)

we have established the desired formula ►
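The identity of Example 4 can be checked on a small box of balls. The sketch below is illustrative (the box of four apricot and three brown balls and the function name are choices made here); it compares a direct enumeration of 3-subsets with the sum of products of binomial coefficients.

```python
from itertools import combinations
from math import comb


def sum_over_brown(a, b, k):
    """Sum over r of C(a, k - r) * C(b, r), r = number of brown balls drawn."""
    return sum(comb(a, k - r) * comb(b, r) for r in range(k + 1))


a, b, k = 4, 3, 3
balls = ["apricot"] * a + ["brown"] * b
subsets = list(combinations(range(len(balls)), k))  # all k-subsets of the box
print(len(subsets), comb(a + b, k), sum_over_brown(a, b, k))
```

Python's `math.comb(n, r)` returns 0 when r exceeds n, so the same function also handles boxes with fewer than k balls of one color.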


The definition

$$\binom{n}{r} = \frac{(n)_r}{r!}$$

for the binomial coefficients was made only for those cases when r lies
between zero and n. However, it is convenient to extend this definition to other
values as follows:

$$\binom{n}{r} = \begin{cases} \dfrac{(n)_r}{r!} & \text{for } r > 0 \\ 1 & \text{for } r = 0 \\ 0 & \text{for } r < 0 \end{cases} \qquad (6)$$

PROBLEMS

3. Prove that $(n)_r$ and $\binom{n}{r}$ are both zero if r > n.

This definition has the advantage of making Equation (5) valid even if
a < k or b < k. Obviously, there is no way of choosing a greater
number of apricot, or brown, balls than are originally contained in the
box. You should check the summation (5) to see that the corresponding
terms are zero.

PROBLEMS 

4 Use the definition (6) to prove that

$$\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1} \qquad (7)$$

for all values of n ≥ 1 and for any r. Be sure to include those cases where
r ≤ 0.

When r is a positive integer not exceeding n, Formula (7) has a counting
interpretation. The $\binom{n}{r}$ r-subsets of an n-set can be classified
according to whether or not they contain a specific element. The number of
subsets containing the specified element is

$$\binom{n-1}{r-1}$$

while the number of subsets which do not contain the specified element is

$$\binom{n-1}{r}$$

for in this case we must choose the entire subset from the other n - 1
elements.

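Example 5 The property of binomial coefficients expressed in Problem 4
can be used to generate all these coefficients starting only from the values

$$\binom{n}{0} = 1 \quad\text{and}\quad \binom{n}{n} = 1$$

In the resulting table, the Pascal Triangle (Table 1), each remaining entry
(row n, column r) is the sum of the entry directly above it (row n - 1,
column r) and the entry above and to the left (row n - 1, column r - 1).
Although only eight rows are shown, the table can, of course, be extended
indefinitely ►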
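The table of binomial coefficients can be generated mechanically from the boundary values 1 and the rule that each interior entry is the sum of the two entries above it. A minimal sketch follows; the function name `pascal_rows` is illustrative and the output layout is not meant to reproduce the printed table.

```python
def pascal_rows(max_n):
    """Rows 0..max_n of binomial coefficients, built only from the boundary
    values 1 and the rule C(n, r) = C(n-1, r-1) + C(n-1, r)."""
    rows = [[1]]
    for n in range(1, max_n + 1):
        previous = rows[-1]
        interior = [previous[r - 1] + previous[r] for r in range(1, n)]
        rows.append([1] + interior + [1])
    return rows


for n, row in enumerate(pascal_rows(7)):
    print(n, row)
```

Raising the argument extends the table to as many rows as desired.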



The method used to demonstrate the binomial expansion theorem may be
extended to the following result.

Multinomial Expansion Theorem Let n be a positive integer and $x_1$,
$x_2$, ..., $x_k$ any real numbers. Then

$$(x_1 + x_2 + \cdots + x_k)^n = \sum \binom{n}{n_1, n_2, \ldots, n_k} x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$$

where the indicated summation is taken over all sets of non-negative
integers $(n_1, n_2, \ldots, n_k)$ for which $n_1 + n_2 + \cdots + n_k = n$.

PROOF In order to obtain the product $(x_1 + x_2 + \cdots + x_k)^n$, we must
choose from each factor one of the numbers $x_1, x_2, \ldots, x_k$, multiply these
choices together, and add all possible products which can be obtained in this
way. Let us imagine that each factor is placed into one of k groups depending
on which of the numbers $x_1, x_2, \ldots, x_k$ is chosen from that factor. The
number of ways of partitioning the n factors so that exactly $n_1$ are placed in
the $x_1$ group, $n_2$ in the $x_2$ group, ..., $n_k$ in the $x_k$ group is

$$\binom{n}{n_1, n_2, \ldots, n_k}$$

Hence this number represents the coefficient of $x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$ in the
expansion.


Example 6 A collection of k non-negative integers $(n_1, n_2, \ldots, n_k)$ is
called a k-partition of the integer n if $n_1 + n_2 + \cdots + n_k = n$. There are
10 3-partitions of the integer 3, namely, (3, 0, 0), (0, 3, 0), (0, 0, 3),
(2, 1, 0), (2, 0, 1), (1, 2, 0), (1, 0, 2), (0, 2, 1), (0, 1, 2), and (1, 1, 1), and,
accordingly, 10 terms in the expansion of any cube of a trinomial.
For instance,

$$(x + y + z)^3 = \binom{3}{3,0,0} x^3 y^0 z^0 + \binom{3}{0,3,0} x^0 y^3 z^0 + \binom{3}{0,0,3} x^0 y^0 z^3$$

$$+ \binom{3}{2,1,0} x^2 y^1 z^0 + \binom{3}{2,0,1} x^2 y^0 z^1 + \binom{3}{1,2,0} x^1 y^2 z^0$$

$$+ \binom{3}{1,0,2} x^1 y^0 z^2 + \binom{3}{0,2,1} x^0 y^2 z^1 + \binom{3}{0,1,2} x^0 y^1 z^2$$

$$+ \binom{3}{1,1,1} x^1 y^1 z^1$$

$$= x^3 + y^3 + z^3 + 3x^2 y + 3x^2 z + 3xy^2 + 3xz^2 + 3y^2 z + 3yz^2 + 6xyz$$ ►
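The terms of a multinomial expansion can be listed by generating every k-partition of n and attaching the corresponding coefficient. The sketch below is illustrative; the function names are chosen here and are not from the text. It reproduces the ten terms of the cube of a trinomial.

```python
from math import factorial


def k_partitions(n, k):
    """Yield every ordered k-tuple of non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n, -1, -1):
        for rest in k_partitions(n - first, k - 1):
            yield (first,) + rest


def multinomial_coefficient(parts):
    """n! / (n_1! n_2! ... n_k!), where n is the sum of the parts."""
    value = factorial(sum(parts))
    for p in parts:
        value //= factorial(p)
    return value


terms = {p: multinomial_coefficient(p) for p in k_partitions(3, 3)}
for exponents, coefficient in terms.items():
    print(coefficient, exponents)
```

Setting x = y = z = 1 shows that the coefficients must add up to $3^3 = 27$, which gives a quick check on the list.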


Example 7 We give an argument to support the contention that the total
number of channels of communication in an organization increases at least
proportionately with the square of the size of the organization.

In a group of n members, the number of pairwise communication
channels is given by the binomial coefficient

$$\binom{n}{2} = \frac{n(n-1)}{2} \qquad (8)$$

Suppose that the group is increased c-fold where c is an integer (c > 1).
Then the total number of communication channels becomes

$$\binom{cn}{2} = \frac{cn(cn-1)}{2} \qquad (9)$$

Dividing (9) by (8) we obtain

$$\frac{\binom{cn}{2}}{\binom{n}{2}} = \frac{cn(cn-1)}{n(n-1)} = \frac{c^2(n^2 - n/c)}{(n^2 - n)}$$

But n/c < n since c > 1 and hence

$$\frac{(n^2 - n/c)}{(n^2 - n)} > 1$$

Therefore,

$$\binom{cn}{2} > c^2 \binom{n}{2}$$ ►


Example 8 Davies and Davies* have suggested a scoring formula for the
Elithorn Perceptual Maze Test based on the number of paths through the
maze. A typical test item is shown in Figure 6. The subject is to trace
a path from the origin (0, 0) to the upper diagonal (whose endpoints are both
labeled 7). He is allowed to pass from a point only to the point
immediately above or to the right. The goal is to find a path which passes
through the greatest possible number of large dots. In Figure 6, three is the
greatest number of large dots lying on a single path.

In an arbitrary maze the kth diagonal consists of those points (x, y) for
which x + y = k. A particular point (x, y) is reached from (0, 0) by taking
x steps to the right (R) and y steps upward (U). Since the order in which the
steps are taken is immaterial, the total number of paths to (x, y) is just the
number of ways of arranging x R's and y U's,

$$\binom{x+y}{x}$$

The total number of paths which terminate on the kth diagonal is $2^k$, since
the determination of any such path is equivalent to filling k positions
with one of the two symbols, R or U. Symbolically,

$$\sum_{x+y=k} \binom{x+y}{x} = 2^k$$

as in Example 1.

FIGURE 6 (a maze item; both axes run from 0 to 7)

Let us denote by $u_m(x, y)$ the number of paths terminating at (x, y) which
pass through exactly m large dots. Since any such path contains (x + y)
steps, it can pass through no more than x + y points. Hence

$$u_0(x, y) + u_1(x, y) + \cdots + u_{x+y}(x, y) = \sum_{m=0}^{x+y} u_m(x, y)$$

is equal to the total number of paths

$$\binom{x+y}{x}$$

Similarly, if $U_m(k)$ denotes the number of paths terminating on the kth
diagonal which pass through exactly m large dots, then

$$U_m(k) = \sum_{x+y=k} u_m(x, y)$$

while

$$U_0(k) + U_1(k) + \cdots + U_k(k) = \sum_{m=0}^{k} U_m(k) = 2^k$$

is the total number of paths to the kth diagonal. Thus, for the maze in
Figure 6, we have the results summarized in Table 2.

The authors argue that a subject should receive a score which reflects both
the difficulty of the maze and the completeness of his solution. The scoring
formula suggested is

$$\log_{10} \frac{M}{U_m(D)}$$

where m is the number of dots on the path selected, the terminal diagonal is
the Dth, and M is the largest of the numbers $U_0(D), U_1(D), \ldots, U_D(D)$.
For Figure 6, M = 59, and a subject who selected a path traversing two dots
would receive a larger score ($\log_{10}(59/49) = 0.08$) than a subject whose
selected path traversed only one dot ($\log_{10}(59/59) = 0$) ►

*Davies, M. G., and Davies, Ann D. M., … 2, 371-380 (1965).


TABLE 2. Total number of paths to the third diagonal in Figure 6



PROBLEMS 

5. Use the binomial theorem for the following computations 

(a) Fourth term of (o + r) 10 

(b) Tenth term of [a- - (1/2*)) 14 

(c) Seventh term of lx 3 + (t 4 /*)1'° 

(d) Term containing b° of (n + *)' 

(e) Term containing .v 12 of [* 4 + (j— /2» 

(f) $(1.01)^8$ to four decimal places

(g) $(0.98)^{12}$ to four decimal places

(h) [(* + >) — 11°, complete expansion 

(0 [2 y — (1/>)1 S , complete expansion 

0) [(x/r 2 ) - term containing x" 

6. Extend the Pascal Triangle (Table 1) to n = 12.

7. Using a form similar to that in Problem 4, argue that

$$\binom{n+1}{n_1, n_2, n_3} = \binom{n}{n_1 - 1, n_2, n_3} + \binom{n}{n_1, n_2 - 1, n_3} + \binom{n}{n_1, n_2, n_3 - 1}$$



8 Prove the Binomial Expansion Theorem by mathematical induction.
(Hint: Use Formula (7).)

9 Suppose that in Example 8, the maze contains only a single dot located
at (a, b). Show that the number of paths which pass through the dot and
terminate at the Dth diagonal is

$$\binom{a+b}{a}\, 2^{D-(a+b)}$$

10 Under the conditions of the preceding problem, develop general formulas
for $u_1(x, y)$ and $u_0(x, y)$

11 Prove that

12 Prove that for any positive integer n,

$$\sum_{k=1}^{n} k \binom{n}{k} = n\, 2^{n-1}$$

(Hint: Divide out the common factor k, factor n out of the summation,
and recognize what is left as the binomial expansion of $(1 + 1)^{n-1}$.)

13 Prove that if n is a positive integer,

$$\binom{n}{1} - 2\binom{n}{2} + 3\binom{n}{3} - \cdots + (-1)^{n-1}\, n \binom{n}{n} = 0$$

14 Prove that if n is a positive integer, then

$$\sum_{k=1}^{n} k^2 \binom{n}{k} = n(n+1)\, 2^{n-2}$$

(Hint: Write $k^2 = k(k - 1) + k$ and proceed as in Problem 12.)

15 S+f+ C °) e . m o C '“' ° f ‘ he " rm ‘" TOlv '"S m the expansion of 

16 Write the complete expansion of $[2x^2 - x + (1/x)]^3$

SUPPLEMENTARY READING

… An Introduction (Prentice-Hall, Inc., Englewood Cliffs, N.J., …), Chapter 3.
… to Mathematics (Harcourt, Brace & World, Inc., New York, 1966), Chapter …
Parzen, E., Modern Probability Theory and Its Applications (John Wiley & Sons,
Inc., New York, 1960), Chapter 2.




r = c, then A is an r-square matrix. In this case the elements $a_{11}, a_{22}, a_{33},
\ldots, a_{rr}$ are said to constitute the main diagonal of A.

PROBLEMS

1. What are the dimensions of the matrices in (1)?

Example 1 Let us define a relation P on the set of objects $\{O_1, O_2, O_3\}$ by
$O_i P O_j$ if and only if object $O_i$ is preferred to object $O_j$. We shall require
that for each pair of different objects one is preferred to the other and that no
object is preferred to itself.

Any particular preference structure can be indicated by writing a 3 × 3
preference matrix $R = (r_{ij})$ in which the (i, j) entry $r_{ij}$ is 1 if object $O_i$ is
preferred to object $O_j$ and is 0 if no such preference exists. For example, the
matrix

$$R = \begin{array}{c|ccc} & O_1 & O_2 & O_3 \\ \hline O_1 & 0 & 1 & 0 \\ O_2 & 0 & 0 & 1 \\ O_3 & 1 & 0 & 0 \end{array}$$

indicates that $O_1$ is preferred to $O_2$, $O_2$ is preferred to $O_3$, and $O_3$ is
preferred to $O_1$ ►


problems 

2 For a set of n objects $\{O_1, O_2, \ldots, O_n\}$, how many ones would appear
in any preference matrix?

3. Could a preference matrix contain a row or column composed of nothing
but zeros? of no zeros?

Two matrices A and B are said to be equal (A = B) if they have the same
dimensions and have the same elements in corresponding positions. That is,
$A = (a_{ij})$ and $B = (b_{ij})$ are equal if $a_{ij} = b_{ij}$ for all choices of i and j.
Thus the matrices



are equal But 



8 ADDITION AND MULTIPLICATION OF MATRICES


8.1 ADDITION OF MATRICES

c::) o ... 


/ 12 -2 7 

, 0 o 2 

\ 5 4 3 


1 4 ^ 

-16 1 
29 - 6 / 


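An array of numbers having r rows and c columns is called an r × c matrix.
A general r × c matrix may be written

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1c} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2c} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & a_{r3} & \cdots & a_{rc} \end{pmatrix}$$

The element in the position of the ith row and jth column is denoted by
$a_{ij}$ or, sometimes, by $(A)_{ij}$. We often denote a matrix such as A by the
shorthand notation $A_{r \times c} = (a_{ij})$, which indicates that A has r rows and
c columns, where i takes the values 1, 2, ..., r and j the values 1, 2, ..., c.
The numbers r and c are called the dimensions of A.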



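For instance,

$$\begin{pmatrix} 1 & -7 & 3 \\ 2 & 0 & 4 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1+0 & -7+0 & 3+0 \\ 2+0 & 0+0 & 4+0 \end{pmatrix} = \begin{pmatrix} 1 & -7 & 3 \\ 2 & 0 & 4 \end{pmatrix}$$

The matrix -A, each of whose elements is the negative of the
corresponding element of A, is the additive inverse of A. That is,

$$A + (-A) = -A + A = O$$

Thus

$$\begin{pmatrix} 1 & -7 & 3 \\ 2 & 0 & 4 \end{pmatrix} + \begin{pmatrix} -1 & 7 & -3 \\ -2 & 0 & -4 \end{pmatrix} = \begin{pmatrix} 1-1 & -7+7 & 3-3 \\ 2-2 & 0-0 & 4-4 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

In general, the sum of any number of matrices having the same dimensions
is obtained by adding corresponding entries.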


Example 3 


■c 


1 + 6+1 + 14 
: + 4 + o + o 


I + 6 - 1 - 3\ 
-7 - 4 + 1 + 2/ 


MU ► 


Example 4 Suppose in Example 1 that each of N judges lists his
preferences among the three objects. Let $R_i$ (i = 1, 2, ..., N) represent the
preference matrix of the ith judge. Then the matrix S whose (i, j) element
(i = 1, 2, 3; j = 1, 2, 3) represents the number of judges who prefer object
$O_i$ to object $O_j$ is given by

$$S = R_1 + R_2 + \cdots + R_N = \sum_{i=1}^{N} R_i$$ ►


Matrix addition is a commutative operation, since matrix addition is
defined in terms of the addition of the individual real number entries. That
is, since

$$a_{ij} + b_{ij} = b_{ij} + a_{ij}$$

it follows that

$$A + B = B + A$$

is not equal to A since $a_{22} = 1 \neq 0 = c_{22}$ and



is not equal to A since the dimensions are different.

When two matrices A and B have the same dimensions, the sum

$$C = A + B$$

is defined to be the matrix obtained by adding corresponding elements in
A and B. That is,

$$c_{ij} = a_{ij} + b_{ij}$$

for each (i, j) pair. If A and B do not have the same dimensions, no sum is
defined.
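The definition of the sum translates directly into a short routine. In this sketch, which is illustrative and not part of the text, a matrix is represented as a list of rows; the function name `add_matrices` and the sample matrices are assumptions made for the example.

```python
def add_matrices(a, b):
    """Return the sum of two matrices given as lists of rows.

    Raises ValueError when the dimensions differ, since no sum is defined.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[1, -7, 3], [2, 0, 4]]
B = [[3, 0, 2], [-1, -1, 6]]
zero = [[0, 0, 0], [0, 0, 0]]
negative_A = [[-x for x in row] for row in A]

print(add_matrices(A, B))
print(add_matrices(A, zero))        # adding the zero matrix leaves A unchanged
print(add_matrices(A, negative_A))  # A plus its additive inverse gives the zero matrix
```

Because each entry of the sum is an ordinary sum of numbers, `add_matrices(A, B)` and `add_matrices(B, A)` return the same matrix.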


Example 2 If 

A 

then 

A + B = 


= (-* 1 3 ) 
\ 0 2 \) 


and B 


1 + 3 1 + 0 3 + 2' 
0-1 2-1 4 + 6, 
Another matrix addition is 


H! 


i) 


(~ x 2 °\ / 1 9 -6 \ f° u ~ 6 \ 

.lit: :) . 

PROBLEMS 

4. Find the sums 




1 2 W° 1 7 ) 

6 —3/ \7 -1 -2/ 


5. If A + B = C, where A, B, and C are matrices and A is 10 × 5, what
must be the dimensions of B and C?


The matrix $O_{r \times c}$, each of whose entries is the number 0, serves as the
identity element for addition of r × c matrices. That is,

$$A + O = O + A = A$$

PROBLEMS 

7. In Example 7, show that $\bar{s}_{ij} = 1 - \bar{s}_{ji}$ when i ≠ j. Of course,
$\bar{s}_{ii} = 0$ for all i. (Why?)

The transpose A' of a matrix A is the matrix obtained from A by
interchanging rows and columns. More precisely, if A is r × c, then A'
is a c × r matrix whose (i, j) element $(A')_{ij}$ is the element in the (j, i)
position of A.*


Example 8 The transpose of 



while that of 


is 






► 


It is easily seen that the transpose of the sum of two matrices is the sum
of the individual transposes. For if A and B are both r × c, the element in
the (i, j) position of the transpose of A + B is

$$((A + B)')_{ij} = (A + B)_{ji} = A_{ji} + B_{ji} = (A')_{ij} + (B')_{ij} = (A' + B')_{ij}$$

which is the element in the (i, j) position of A' + B'. Since (A + B)' and
A' + B' have the same elements in the same positions, they must be equal.





Example 5 An example of the commutative property of matrix addition is





► 


Matrix addition is also an associative operation. That is, if $A$, $B$, and $C$ are three matrices having the same dimensions, then $(A + B) + C = A + (B + C)$. This follows directly from the associative property for addition of numbers. The proof is left as an exercise (Problem 12).

PROBLEMS

6. Prove that if for three matrices $A$, $B$, and $C$ we have $A + B = A + C$, then $B = C$. (Hint: Add $-A$ to both sides.)

The matrix $-A$ is obtained from $A$ by replacing each element by its negative or, equivalently, by multiplying each element of $A$ by $-1$. We shall describe this operation by saying that the matrix $A$ itself is multiplied by $-1$. More generally, the matrix obtained from $A$ by multiplying each element of $A$ by the same real number $c$ is called the product of $A$ by the scalar $c$ and is written $cA$. We define $Ac$ to mean the same thing as $cA$.

Example 6 If 


3 


(; 

2 


and -3,4 = [ 3 

-3 

-18\ 


4 

8/ 

V 3 

-6 

-12/ 




— 1 | N 

»*“*£*■ 

is a different kind of matrix, whose individual elements $s_{ij}$ represent the proportion of judges who prefer object $i$ to object $j$. ►





Prove these results and then look at computational details when a 4, 

^ = (-7-2 i) ^ B = ( 0 4 d) 

14. (a) Show that if $b$ is a scalar and $A$ a matrix, and if $bA = O$ (the zero matrix), then either $b = 0$ or $A = O$.

(b) Show that if $bA = bC$ and if $b \neq 0$, then $A = C$, and that if $A = C$, then $bA = bC$ for all numbers $b$.


15. Prove that 


(A - B)' = A' -B' 


The trace of a square matrix $A$ is the sum of the entries on the main diagonal and is denoted tr $A$.

For example, the trace of a $3 \times 3$ matrix whose main diagonal entries are 1, 4, and $-1$ is $1 + 4 - 1 = 4$.


Prove the following:

16. If $A$ is a square matrix, then tr $A$ = tr $A'$.

17. If $A$ and $B$ are both $n$-square matrices, then tr $(A + B)$ = tr $A$ + tr $B$.

18. If $A$ is a square matrix and $c$ is a scalar, then tr $(cA)$ = $c$ tr $A$.
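The three trace properties can be checked numerically before they are proved. The sketch below is an illustration only: the helper names and the matrices `A`, `B` and scalar `c` are assumptions chosen for the example, and a numerical check is of course not a proof.

```python
def trace(m):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    if any(len(row) != len(m) for row in m):
        raise ValueError("trace is defined only for square matrices")
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]


# Hypothetical 2-square matrices and scalar, for illustration only.
A = [[1, 2], [3, 4]]
B = [[0, 5], [-1, 2]]
c = 3

A_plus_B = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
cA = [[c * x for x in row] for row in A]

print(trace(A), trace(transpose(A)))         # 5 5
print(trace(A_plus_B), trace(A) + trace(B))  # 7 7
print(trace(cA), c * trace(A))               # 15 15
```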


8.2 MULTIPLICATION OF MATRICES


A matrix consisting of a single row is called a row vector, while a one-column matrix is called a column vector. The entries in a vector are called its elements, or components. The number of elements is the dimension of the vector. For instance, $V_1 = (2, 0, -3, 7, 6)$ and $V_2 = (\ldots)$ are, respectively, five- and two-dimensional row vectors, while

$$V_3 = \begin{pmatrix} 4 \\ 2 \\ 1 \\ 0 \end{pmatrix}$$

is a four-dimensional column vector.

The concept of dimension for vectors stems from our practice of using pairs, or triples, of numbers to designate points on the plane or in space. Thus, in Figure 1(a), the point $(x, y)$ on the plane is found $x$ units to the





Example 9 If 


then 


while 


A + B - 



PROBLEMS

8. If $A = (\ldots)$, $B = (\ldots)$, and $C = (\ldots)$, find $A + B$, $B + A$, $(A + B) + C$, $A + (B + C)$, $7A - 4B$, $(A + B)'$, $A' + B'$, $4A'$, and $(4A)'$.

9. In Example 7, prove that $\mathcal{S} + \mathcal{S}'$ is a matrix $U$ such that $u_{ii} = 0$, while $u_{ij} = 1$ when $i \neq j$. (Hint: See Problem 7.)

10. The transpose of the transpose of a matrix $A$ is equal to $A$ itself. That is, $(A')' = A$. Prove this and give an example.

11. Prove that the transpose of the sum of any number of matrices is the sum of the individual transposes.

12. Prove that matrix addition is associative. That is, prove that if $A$, $B$, and $C$ have the same dimensions, then

$$(A + B) + C = A + (B + C)$$

13. Scalar multiplication has associative, commutative, and distributive properties. Specifically, if $a$ and $b$ are numbers and if $A$ and $B$ are matrices having the same dimensions, then

(a) $aA = Aa$

(b) $(a + b)A = aA + bA = Aa + Ab = A(a + b)$

(c) $a(A + B) = aA + aB = Aa + Ba = (A + B)a$

(d) $(ab)A = a(bA) = b(aA) = A(ab)$

(e) $1A = A$ and $0A = O$ (the zero matrix)





FIGURE 2 



The sum $(1, 3) + (2, -2)$ is the vector $(3, 1)$ drawn from the origin to the head of the vector $(2, -2)$ placed so as to start at the point $(1, 3)$.


FIGURE 3 






right and $y$ units above the origin $(0, 0)$. The total distance from the origin to $(x, y)$ is $\sqrt{x^2 + y^2}$, called the length of the vector $(x, y)$. Geometrically, it is convenient to draw an arrow from the origin to $(x, y)$ and to think of the distance $\sqrt{x^2 + y^2}$ as the length of this arrow. Similarly, in Figure 1(b), the point $(x, y, z)$ is $x$ units toward you, $y$ units to the right, and $z$ units up from the origin in three-dimensional space. The length of the vector $(x, y, z)$ is $\sqrt{x^2 + y^2 + z^2}$.


FIGURE 1 




The generalization to four-, five-, and higher-dimensional vectors is made by analogy to the strictly computational aspects of two- and three-dimensional vectors, since no geometric interpretation is available. In particular, the length of an $n$-dimensional vector $V = (x_1, x_2, \ldots, x_n)$ is defined to be

$$|V| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

Geometrically, we make no distinction between row and column vectors.
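As a small illustration of the length formula, the following sketch computes $|V|$ for vectors of any dimension. The function name `length` and the sample vectors are assumptions made for the example.

```python
import math


def length(v):
    """Length |V| = sqrt(x_1^2 + ... + x_n^2) of an n-dimensional vector."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical vectors of dimension two, three, and four.
print(length((3, 4)))        # 5.0
print(length((1, 2, 2)))     # 3.0
print(length((1, 1, 1, 1)))  # 2.0
```

The same function serves for row and column vectors, since only the list of components matters.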
PROBLEMS

1. Find the lengths of the vectors $V_1$, $V_2$, and $V_3$ above.

Since vectors are matrices, they obey the same rules of addition and multiplication by scalars. The practice of drawing arrows to represent vectors enables us to give geometric interpretations to these matrix operations.


Example 1 If we multiply the vector $(1, 3)$ (shown in position (a) of Figure 2) by 2, we obtain the vector $(2, 6)$ (position (b)), which lies in the same direction from the origin as $(1, 3)$ but is twice as far away. If we multiply by $-2$, we obtain the vector $(-2, -6)$, which is twice as long as $(1, 3)$ and in the opposite direction from the origin. ►





PROBLEMS


2. Find the following vector products.



3. A monopolist produces commodities $A$, $B$, and $D$ having respective unit production costs of 5¢, 10¢, and 1¢. Suppose that he sells 25,000 units of commodity $A$, 40,000 units of $B$, and 62,000 units of $D$. Write a three-component row vector $R$ whose elements are the respective production totals and a three-dimensional column vector $C$ whose entries are the unit production costs. Then find the total production cost by computing the vector product $RC$.

4. (a) Two vectors, such as those in Problem 2(c), whose product is zero, are called orthogonal, or perpendicular. Find two different vectors each of which is orthogonal to $(1, 2, 1)$.

(b) Find two vectors which are orthogonal both to $(1, 2, 1)$ and to each other.


Multiplication of matrices is defined, not by multiplying corresponding elements, but by multiplying corresponding rows and columns. An example will best serve to establish the procedure.

Example 4 The product

$$AB = \begin{pmatrix} 2 & 0 & -3 \\ 1 & -1 & 4 \end{pmatrix} \begin{pmatrix} -1 & 0 & 7 \\ 6 & 0 & -2 \\ 2 & 3 & 1 \end{pmatrix}$$

of the matrices $A$ and $B$ is the matrix

$$C = AB = \begin{pmatrix} -8 & -9 & 11 \\ 1 & 12 & 13 \end{pmatrix}$$

whose entries are obtained as follows. The element $c_{11} = -8$ in the first row and first column of $C$ is found by multiplying the first row of $A$ by the first column of $B$ to obtain the product

$$(2 \quad 0 \quad -3) \begin{pmatrix} -1 \\ 6 \\ 2 \end{pmatrix} = -8$$





Since "the shortest distance between two points is a straight line," it is apparent from Figure 3(a) that the length of the vector $V_1 + V_2$ cannot exceed the sum of the lengths of $V_1$ and $V_2$. Thus we have the Triangle Inequality for vectors:

$$|V_1 + V_2| \leq |V_1| + |V_2|$$

(Compare with Property 6 for absolute value in Section 4.2.)

In order to save space, we often denote a column vector $V$ with components $a_1, a_2, \ldots, a_n$ by $V = \text{col}\,(a_1, a_2, \ldots, a_n)$ or, in transpose notation, by $(a_1, a_2, \ldots, a_n)'$. For instance, the vector $V_3$ above would be written either $V_3 = \text{col}\,(4, 2, 1, 0)$ or $V_3 = (4, 2, 1, 0)'$.

If $R = (r_1, r_2, \ldots, r_n)$ and $C = \text{col}\,(c_1, c_2, \ldots, c_n)$ are row and column vectors having the same dimension, the product

$$RC = (r_1, r_2, \ldots, r_n) \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}$$

of $R$ and $C$ is obtained by multiplying the corresponding elements of the two vectors and summing. Specifically,

$$RC = r_1 c_1 + r_2 c_2 + \cdots + r_n c_n = \sum_{i=1}^{n} r_i c_i$$

For example, the product of $R = (2, 0, -3)$ and $C = \text{col}\,(-1, 6, 2)$ is

$$RC = (2, 0, -3) \begin{pmatrix} -1 \\ 6 \\ 2 \end{pmatrix} = 2(-1) + 0(6) - 3(2) = -8$$
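The row-by-column product can be sketched as a one-line sum. The function name `row_times_column` is an assumption made for this illustration; the vectors are those of the worked example, $R = (2, 0, -3)$ and $C = \text{col}\,(-1, 6, 2)$.

```python
def row_times_column(r, c):
    """Product RC of a row vector and a column vector of the same dimension."""
    if len(r) != len(c):
        raise ValueError("vectors must have the same dimension")
    # RC = r_1 c_1 + r_2 c_2 + ... + r_n c_n
    return sum(r_i * c_i for r_i, c_i in zip(r, c))


R = (2, 0, -3)
C = (-1, 6, 2)
RC = row_times_column(R, C)
print(RC)  # -8
```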


In their learning models, Bush and Mosteller* operate with column vectors $V = \text{col}\,(p_1, p_2, \ldots, p_n)$, where $p_i$ is the probability of response $R_i$. The requirement that these probabilities sum to unity can be expressed as the vector product equation

$$(1, 1, \ldots, 1)\,V = p_1 + p_2 + \cdots + p_n = 1$$

*Bush, R. R., and Mosteller, F., Stochastic Models for Learning (John Wiley & Sons, Inc., New York, 1955).



Similarly, element $c_{12}$ in the first row and second column of $C$ is found by multiplying the first row of $A$ by the second column of $B$ to obtain

$$c_{12} = (2 \quad 0 \quad -3) \begin{pmatrix} 0 \\ 0 \\ 3 \end{pmatrix} = -9$$

and so on, for the remaining four elements. ►

In general, the product $C = AB$ of matrices $A$ and $B$ is defined only if the number of columns of $A$ is the same as the number of rows of $B$. To find the element in the $i$th row and $j$th column of $C$, we multiply the $i$th row of $A$ by the $j$th column of $B$. In this way we obtain a new matrix having the same number of rows as $A$ and the same number of columns as $B$. Thus, if $A$ is $r \times p$ and $B$ is $q \times c$, then $C = AB$ is defined only if $p = q$, in which case $C$ is $r \times c$.


Example 5 If 


then 





AB 

BA 

while CA and CB 


/‘ 4 1 \ 


/ —2 

8 — 2 \ 

\21 6 / 

AC = 

\ 3 

.-») 

/ 0 9' 

7 20 

j BC = 

(1 

3 - 3 \ 



\30 - 

40 -2/ 


are not defined 


► 


It is apparent from Example 5 that matrix multiplication is not commutative. In that example, the product $CA$ is not even defined, even though $AC$ is. Moreover, even though $AB$ and $BA$ are both defined, they are not equal. However, some particular pairs of matrices do commute.
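The row-by-column rule and the failure of commutativity can both be seen in a short sketch. The function name `mat_mul` is an assumption; `A` and `B` are the matrices of Example 4 in this section, while `M` and `N` are hypothetical $2 \times 2$ matrices chosen only to exhibit $MN \neq NM$.

```python
def mat_mul(a, b):
    """Product AB: entry (i, j) is row i of A times column j of B."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Matrices of Example 4 (2 x 3 and 3 x 3).
A = [[2, 0, -3], [1, -1, 4]]
B = [[-1, 0, 7], [6, 0, -2], [2, 3, 1]]
C = mat_mul(A, B)
print(C)  # [[-8, -9, 11], [1, 12, 13]]

# Hypothetical square matrices for which both products exist but differ.
M = [[1, 2], [0, 1]]
N = [[1, 0], [3, 1]]
MN = mat_mul(M, N)
NM = mat_mul(N, M)
print(MN)  # [[7, 2], [3, 1]]
print(NM)  # [[1, 2], [3, 7]]
```

Calling `mat_mul(B, A)` raises `ValueError`, since $B$ has three columns but $A$ has only two rows; this mirrors the case of a product that is not defined.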


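PROBLEMS

5. Show that if

$$A = \begin{pmatrix} -1 & -2 & -2 \\ 1 & 2 & 1 \\ -1 & -1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -3 & -6 & 2 \\ 2 & 4 & -1 \\ 2 & 3 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} -5 & -8 & 0 \\ 3 & 5 & 0 \\ 1 & 2 & -1 \end{pmatrix}$$

then $AB = BA = C$, $BC = CB = A$, and $AC = CA = B$.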

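6. (a) Show that the matrices

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad O = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

commute with

$$A = (\ldots)$$

(b) Show that $I$ and $O$ commute with every $2 \times 2$ matrix $A$.

7. Argue that only square matrices can commute.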

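The zero matrices $O_{r \times c}$, whose elements are all zeros, play the role of zero in matrix multiplication. If $A$ is an $r \times c$ matrix and $O$ denotes a matrix of zeros, then

$$O_{p \times r} A_{r \times c} = O_{p \times c} \quad \text{and} \quad A_{r \times c} O_{c \times q} = O_{r \times q} \qquad (3)$$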

In multiplication of real numbers, the number 1 serves as the identity element in the sense that $1a = a$ for every real number $a$. For matrices, the role of the identity is played by the identity matrices

$$I_{n \times n} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

which have main diagonal entries of 1 and elsewhere 0. In particular, it is easily verified that if $A$ is an $r \times c$ matrix, $I_r$ is the $r \times r$ identity matrix, and $I_c$ is the $c \times c$ identity, then

$$I_r A = A = A I_c \qquad (4)$$

The matrix $I_r$ commutes with every $r \times r$ matrix $A$.
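Equations (3) and (4) can be checked on a single case with a few helper functions. This is a sketch only: the names `identity`, `zeros`, and `mat_mul` are assumptions, and the matrix `A` is the $2 \times 3$ matrix used in Example 4 of this section.

```python
def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zeros(r, c):
    """The r x c zero matrix."""
    return [[0] * c for _ in range(r)]


def mat_mul(a, b):
    """Product AB of matrices stored as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[2, 0, -3], [1, -1, 4]]  # r = 2, c = 3

# Equation (4): I_r A = A = A I_c
print(mat_mul(identity(2), A) == A, mat_mul(A, identity(3)) == A)  # True True

# Equation (3), with p = 4 and q = 5 chosen arbitrarily
print(mat_mul(zeros(4, 2), A) == zeros(4, 3))  # True
print(mat_mul(A, zeros(3, 5)) == zeros(2, 5))  # True
```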

PROBLEMS

8. Go through the details of matrix multiplication to obtain Equations (3) and (4).

Example 6 Suppose that each of a group of $n$ subjects is given $t$ tests. In factor analysis* it is assumed that the score $s_{ji}$ of subject $i$ on test $j$ depends both on the degree to which subject $i$ possesses certain underlying ability factors and the extent to which the test measures these factors. Specifically,

$$s_{ji} = f_{j1} p_{1i} + f_{j2} p_{2i} + \cdots + f_{jq} p_{qi} \qquad (5)$$

where $f_{jm}$ ($m = 1, 2, \ldots, q$) denotes the degree to which test $j$ measures factor $m$, and $p_{mi}$ is the standard score of individual $i$ on factor $m$. It is convenient to scale the observed scores and the standard scores so that

$$\sum_{i=1}^{n} s_{ji} = 0, \qquad \frac{1}{n} \sum_{i=1}^{n} s_{ji}^2 = 1$$

$$\sum_{i=1}^{n} p_{mi} = 0, \qquad \frac{1}{n} \sum_{i=1}^{n} p_{mi}^2 = 1 \qquad (6)$$

In matrix form, Equation (5) may be written

$$S = FP \qquad (7)$$

where $S = (s_{ji})$ is a $t \times n$ matrix of observed scores, $F = (f_{jm})$ is a $t \times q$ factor matrix, and $P = (p_{mi})$ is a $q \times n$ matrix of standard scores. The elements in each row of $S$ and of $P$ add to zero, and the sum of squares of the elements in any row is $n$.

The correlation $r_{jk}$ between tests $j$ and $k$ is defined by

$$r_{jk} = \frac{1}{n} \sum_{i=1}^{n} s_{ji} s_{ki}$$

and the correlation $z_{mv}$ between factors $m$ and $v$ by

$$z_{mv} = \frac{1}{n} \sum_{i=1}^{n} p_{mi} p_{vi}$$

In matrix form,

$$R = \frac{1}{n} SS' \quad \text{and} \quad Z = \frac{1}{n} PP' \qquad (8)$$

where $R$ is the $t \times t$ matrix of intertest correlations and $Z$ is the $q \times q$ matrix of interfactor correlations. Substituting (7) into (8) gives

$$R = \frac{1}{n} FPP'F' = FZF'$$

If the correlations between different factors are zero, the factors are said to be uncorrelated. In this case $Z = I$ and we have

$$R = FF'$$

as the basic equation of factor analysis. ►

*Thurstone, L. L., Multiple-Factor Analysis (University of Chicago Press, Chicago, 1947).
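For uncorrelated factors, the intertest correlations follow from the factor matrix alone through $R = FF'$. The sketch below illustrates the computation with a hypothetical $3 \times 2$ factor matrix whose rows have unit length; exact fractions are used so that the diagonal entries come out exactly 1. The helper names are assumptions made for the example.

```python
from fractions import Fraction as Fr


def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical factor matrix: 3 tests (rows) measured on 2 factors (columns).
F = [
    [Fr(3, 5), Fr(4, 5)],
    [Fr(1), Fr(0)],
    [Fr(4, 5), Fr(3, 5)],
]

R = mat_mul(F, transpose(F))  # R = FF', a 3 x 3 matrix

for row in R:
    print([str(x) for x in row])
# ['1', '3/5', '24/25']
# ['3/5', '1', '4/5']
# ['24/25', '4/5', '1']
```

The result is symmetric with ones on the main diagonal, as a correlation matrix should be.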





PROBLEMS


9. If $c$ is a number, a matrix of the form $cI$, having the number $c$ down the main diagonal and zeros elsewhere, is called a scalar matrix. Argue that an $r \times r$ scalar matrix commutes with all $r \times r$ matrices.

The only matrix $M$ which has the property that $MA = A$ for all matrices $A$ having $r$ rows is $M = I_r$. For $I_r$ is itself one such matrix and, supposing that $M$ has such a property, we have

$$M I_r = I_r$$

On the other hand, we have already seen that

$$M I_r = M$$

whence $M = I_r$. A similar statement holds for multiplication on the right.


PROBLEMS

10. Suppose a matrix $E$ has the property that for all matrices $A$ having $r$ rows, $EA = O$, a zero matrix. Prove that $E$ must itself be a zero matrix. (Hint: What if $A = I_r$?)

11. If $A = (\ldots)$, $B = (\ldots)$, and $C = (\ldots)$, find

(a) $A^2$  (b) $B^2$  (c) $C^2$
(d) $AB$  (e) $BA$  (f) $AC$
(g) $CA$  (h) $BC$  (i) $CB$

12. If $R_n = (1, 2, \ldots, n)$, show that

$$R_n R_n' = \frac{n(n + 1)(2n + 1)}{6}$$

13. If $R = (1, 3, 2)$, verify that

$$R'R = \begin{pmatrix} 1 & 3 & 2 \\ 3 & 9 & 6 \\ 2 & 6 & 4 \end{pmatrix}$$


14. Express the following economic model as a matrix equation:

$$Y = C + I$$
$$C = a + bY$$
$$I = u + vY$$

where $C$ and $I$ denote the respective monetary values of aggregate consumption and investment, and $a$, $b$, $u$, and $v$ are constants with $b + v \neq 1$.

15. Katz* defines an irreflexive choice relation $C$ and an irreflexive perceived choice relation $P$ by

$p_i C p_j \iff$ person $i$ chooses person $j$

$p_k P p_i \iff$ person $k$ thinks he has been chosen by person $i$

He contends that the tendency of person $k$ to identify with person $j$ is reflected in the number $n_{kj}$ of persons by whom person $k$ thinks he has been chosen, but who actually choose person $j$.

Define a choice matrix $C = (c_{ij})$ by

$$c_{ij} = \begin{cases} 1 & \text{if } p_i C p_j \\ 0 & \text{otherwise} \end{cases}$$

and a perceived choice matrix $P = (p_{ki})$ by

$$p_{ki} = \begin{cases} 1 & \text{if } p_k P p_i \\ 0 & \text{otherwise} \end{cases}$$

Show that the numbers $n_{kj}$ are the elements of the matrix product

$$N = PC$$

*Katz, L., … Identification … Sociometric Groups, Preprint of the University of Michigan.

16. Let $U = (u_1, u_2, \ldots, u_n)$ and $V = (v_1, v_2, \ldots, v_n)$ be two $n$-dimensional row vectors and let $X = U - V$. The distance between $U$ and $V$ is defined to be the length of $X$. That is, the distance between $U$ and $V$ is $|X| = |U - V|$.

(a) Prove that

$$|U - V| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$$

(b) Prove that the distance between $U$ and $V$ can be written in the alternative form

$$|U - V| = \sqrt{(U - V)(U - V)'}$$

17. Define a distance function $d$ by $d(U, V) = |U - V|$. Prove that

(a) $d(U, V) \geq 0$, for all vectors $U$ and $V$.

(b) $d(U, V) = 0$ if and only if $U = V$.

(c) the Triangle Inequality $d(U, W) \leq d(U, V) + d(V, W)$ holds for all vectors $U$, $V$, and $W$.

18. Five stimulus objects were rated on three factors $x_1$, $x_2$, and $x_3$ as follows:


Object 

Factor x t i 

Factor x 2 l 

Factor x 3 3 

Compute the distance matrix $D = (d_{ij})$, where $d_{ij}$ is the distance between objects $i$ and $j$.
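A distance matrix of the kind described in Problem 18 can be assembled directly from the formula $|U - V| = \sqrt{(U - V)(U - V)'}$. The sketch below uses three hypothetical objects rated on three factors, not the data of the problem; the function names are assumptions made for the example.

```python
import math


def distance(u, v):
    """Distance |U - V| between two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(points):
    """Matrix D = (d_ij) of distances between every pair of objects."""
    return [[distance(p, q) for q in points] for p in points]


# Hypothetical ratings of three objects on three factors.
objects = [(0, 0, 0), (3, 4, 0), (3, 4, 12)]
D = distance_matrix(objects)
for row in D:
    print(row)
# [0.0, 5.0, 13.0]
# [5.0, 0.0, 12.0]
# [13.0, 12.0, 0.0]
```

The output is symmetric with zeros on the diagonal, and $13 \leq 5 + 12$ is one instance of the Triangle Inequality.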





19. Prove that if $A$ and $B$ are both $n$-square, then tr $(AB)$ = tr $(BA)$. (See Problems 16-18 in Section 8.1.)

20. Given the following factor matrix $F$:

            Reference abilities
    Tests       A        B
      1       0.600    0.800
      2       0.700    0.714
      3       0.916    0.400
      4       0.100    0.995
      5       0.954    0.300

Compute the intercorrelation matrix $R$.

8.3 BASIC PROPERTIES OF MULTIPLICATION;
BLOCK MULTIPLICATION

Although matrix multiplication is not commutative, it is associative, and multiplication distributes over addition. This means that, except for the fact that multiplication does not commute, operations involving matrix multiplication and addition are the same as similar operations involving numbers.

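Theorem 1 Associative Law for Matrix Multiplication. Let $A$, $B$, and $C$ have dimensions $r \times q$, $q \times p$, and $p \times c$, respectively. Then

$$(AB)C = A(BC)$$

PROOF From the dimensions for $A$, $B$, and $C$, we see that $AB$ is $r \times p$ and $BC$ is $q \times c$, so that $(AB)C$ and $A(BC)$ have the same dimensions, $r \times c$. As required by the definition of equality, we must compare corresponding elements. The $(i, j)$ element of $(AB)C$ is

$$[(AB)C]_{ij} = \sum_{k=1}^{p} (AB)_{ik} c_{kj} = \sum_{k=1}^{p} \left( \sum_{m=1}^{q} a_{im} b_{mk} \right) c_{kj} \qquad (9)$$

The $(i, j)$ element of $A(BC)$ is

$$[A(BC)]_{ij} = \sum_{m=1}^{q} a_{im} (BC)_{mj} = \sum_{m=1}^{q} a_{im} \left( \sum_{k=1}^{p} b_{mk} c_{kj} \right) \qquad (10)$$

Interchanging the order of summation in (10) shows that (9) and (10) are identical, which completes the proof.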
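A single numerical case does not prove the theorem, but it shows how the dimensions fit together. In the sketch below the matrices are hypothetical, with $r = 2$, $q = 3$, $p = 2$, $c = 2$, and `mat_mul` is an assumed helper name.

```python
def mat_mul(a, b):
    """Product AB of matrices stored as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical matrices: A is 2 x 3, B is 3 x 2, C is 2 x 2.
A = [[1, 2, 0], [0, 1, -1]]
B = [[1, 0], [2, 1], [0, 3]]
C = [[2, 1], [1, 0]]

left = mat_mul(mat_mul(A, B), C)   # (AB)C
right = mat_mul(A, mat_mul(B, C))  # A(BC)
print(left)   # [[12, 5], [2, 2]]
print(right)  # [[12, 5], [2, 2]]
```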

When $A$ is a square matrix, it is possible to multiply $A$ by itself to form the matrix product

$$A^2 = AA$$




Example 2 A manufacturer produces three products $\alpha$, $\beta$, and $\gamma$ which he sells in two markets. Annual sales volumes are indicated by the matrix

                    α        β        γ
    Q = Market 1 ( 10,000    2,000   18,000 )
        Market 2 (  6,000   20,000    8,000 )

If unit sales prices are given by the vector

                α       β       γ
    P = ( $2.50   $1.25   $1.50 )

then the total revenue in each market is obtained from the matrix product

$$PQ' = (2.50 \quad 1.25 \quad 1.50) \begin{pmatrix} 10{,}000 & 6{,}000 \\ 2{,}000 & 20{,}000 \\ 18{,}000 & 8{,}000 \end{pmatrix} = (\$54{,}500 \quad \$52{,}000)$$

Similarly, if the vector

                α       β       γ
    C = ( $1.80   $1.20   $0.80 )

lists unit costs for the three commodities, then the total costs (corresponding to the individual markets) are given by the matrix product

$$CQ' = (1.80 \quad 1.20 \quad 0.80) \begin{pmatrix} 10{,}000 & 6{,}000 \\ 2{,}000 & 20{,}000 \\ 18{,}000 & 8{,}000 \end{pmatrix} = (\$34{,}800 \quad \$41{,}200)$$

Making use of the distributive property of matrices, we can express the vector of gross profits as

$$PQ' - CQ' = (\$19{,}700 \quad \$10{,}800) = (P - C)Q'$$

►
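The arithmetic of Example 2 can be reproduced with a short sketch. To keep the computation exact, prices and costs are entered in cents rather than dollars; that choice, like the helper names, is an assumption made for the illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


Q = [[10000, 2000, 18000], [6000, 20000, 8000]]  # sales volumes by market
P = [[250, 125, 150]]                            # unit prices in cents
C = [[180, 120, 80]]                             # unit costs in cents

revenue = mat_mul(P, transpose(Q))              # PQ'
cost = mat_mul(C, transpose(Q))                 # CQ'
profit = mat_mul(mat_sub(P, C), transpose(Q))   # (P - C)Q'

print(revenue)  # [[5450000, 5200000]]  i.e. $54,500 and $52,000
print(cost)     # [[3480000, 4120000]]  i.e. $34,800 and $41,200
print(profit)   # [[1970000, 1080000]]  i.e. $19,700 and $10,800
```

Subtracting `cost` from `revenue` gives the same vector as `profit`, which is the distributive property at work.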


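PROBLEMS

2. If $A = (\ldots)$, $B = (\ldots)$, and $C = (\ldots)$, verify that $AB + AC = A(B + C)$.

Theorem 3 Transpose of a Product. The transpose of a product of two matrices is the product of the transposes in reverse order. That is, if $A$ is $r \times q$ and $B$ is $q \times c$, then

$$(AB)' = B'A' \qquad (11)$$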




called, naturally, the square of $A$. Then

$$A^2 A = (AA)A = A(AA) = AA^2$$

follows from associativity. This means that the matrices $A$ and $A^2$ commute and that the matrix product $AAA$ produces the same result regardless of which pair is multiplied first. A similar argument may be made in general, and we define the $n$th power $A^n$ of $A$ as the product of $n$ matrices, each of which is $A$. For convenience, we also define $A^0$ to be the identity matrix $I$.


Example 1 If 




Associativity means that the usual rules of exponents apply to matrix multiplication. That is,

$$A^n A^m = A^m A^n = A^{n+m}$$

holds for all nonnegative integers $m$ and $n$.
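The definition of $A^n$ as a repeated product, with $A^0 = I$, translates into a simple loop. The following is a sketch under that definition; the function names and the sample matrix are assumptions made for the example.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_pow(a, n):
    """nth power of a square matrix for a nonnegative integer n; A^0 = I."""
    if n < 0:
        raise ValueError("only nonnegative powers are defined here")
    result = identity(len(a))
    for _ in range(n):
        result = mat_mul(result, a)
    return result


# Hypothetical 2 x 2 matrix whose powers are easy to follow.
A = [[1, 1], [0, 1]]
print(mat_pow(A, 0))                          # [[1, 0], [0, 1]]
print(mat_pow(A, 5))                          # [[1, 5], [0, 1]]
print(mat_mul(mat_pow(A, 2), mat_pow(A, 3)))  # [[1, 5], [0, 1]]
```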


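PROBLEMS

1. If $A = (\ldots)$, find $A^2$, $A^3$, and $A^4$. Verify that $A A^3 = A^2 A^2$.

Theorem 2 Distributive Law. Let $A$ be an $r \times q$ matrix and let $B$ and $C$ both have dimension $q \times c$. Then

$$A(B + C) = AB + AC$$

PROOF Both sides have dimension $r \times c$. Using the definitions of addition and multiplication,

$$[A(B + C)]_{ij} = \sum_{k=1}^{q} a_{ik}(b_{kj} + c_{kj}) = \sum_{k=1}^{q} a_{ik} b_{kj} + \sum_{k=1}^{q} a_{ik} c_{kj} = (AB)_{ij} + (AC)_{ij} = (AB + AC)_{ij}$$

This completes the proof.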



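Suppose we partition $A$ into two submatrices, $A = (A_1 \quad A_2)$, where

$$A_1 = \begin{pmatrix} 2 & 0 \\ 1 & -1 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} -3 \\ 4 \end{pmatrix}$$

and partition $B$ as

$$B = \begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix}$$

where

$$B_1 = \begin{pmatrix} -1 \\ 6 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 0 & 7 \\ 0 & -2 \end{pmatrix}, \quad B_3 = (2), \quad B_4 = (3 \quad 1)$$

These partitionings are indicated by the dotted lines in (12). If we now pretend that $A$ is a matrix having only two elements and $B$ a matrix with four, we might form the product

$$(A_1 \quad A_2) \begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix} = (A_1 B_1 + A_2 B_3 \quad A_1 B_2 + A_2 B_4) \qquad (13)$$

Note that since we partitioned the columns of $A$ in the same way as the rows of $B$, all the products $A_1 B_1$, $A_2 B_3$, $A_1 B_2$, $A_2 B_4$ make sense as matrix multiplications. Moreover, $A_1 B_1$ and $A_2 B_3$ are both $2 \times 1$, while $A_1 B_2$ and $A_2 B_4$ are both $2 \times 2$. Thus the sums also make sense as matrix addition.

Simple calculation shows that

$$A_1 B_1 = \begin{pmatrix} -2 \\ -7 \end{pmatrix}, \quad A_2 B_3 = \begin{pmatrix} -6 \\ 8 \end{pmatrix}, \quad A_1 B_2 = \begin{pmatrix} 0 & 14 \\ 0 & 9 \end{pmatrix}, \quad A_2 B_4 = \begin{pmatrix} -9 & -3 \\ 12 & 4 \end{pmatrix}$$

Hence

$$A_1 B_1 + A_2 B_3 = \begin{pmatrix} -8 \\ 1 \end{pmatrix} \quad \text{and} \quad A_1 B_2 + A_2 B_4 = \begin{pmatrix} -9 & 11 \\ 12 & 13 \end{pmatrix}$$

and the matrix product becomes

$$\left( \begin{array}{c|cc} -8 & -9 & 11 \\ 1 & 12 & 13 \end{array} \right)$$

which is the same as $AB$. ►

Details of the proof that the procedure indicated in this example always works are straightforward but rather lengthy. We will state the general theorem.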




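PROOF Since $AB$ is $r \times c$, $B'$ is $c \times q$, and $A'$ is $q \times r$, it follows that both sides of (11) have dimension $c \times r$. Then

$$[(AB)']_{ij} = (AB)_{ji} = \sum_{k=1}^{q} a_{jk} b_{ki}$$

$$(B'A')_{ij} = \sum_{k=1}^{q} (B')_{ik} (A')_{kj} = \sum_{k=1}^{q} b_{ki} a_{jk}$$

The two sums are obviously equal.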
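Theorem 3 can be checked on a concrete pair of matrices. The sketch below uses the $2 \times 3$ and $3 \times 3$ matrices of Example 4 in Section 8.2; the helper names are assumptions made for the illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[2, 0, -3], [1, -1, 4]]
B = [[-1, 0, 7], [6, 0, -2], [2, 3, 1]]

lhs = transpose(mat_mul(A, B))             # (AB)'
rhs = mat_mul(transpose(B), transpose(A))  # B'A'
print(lhs)  # [[-8, 1], [-9, 12], [11, 13]]
print(rhs)  # [[-8, 1], [-9, 12], [11, 13]]
```

Note that the product in the original order, $A'B'$, is not even defined here, since $A'$ is $3 \times 2$ and $B'$ is $3 \times 3$.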
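Example 3 In Example 2,

$$(P - C)Q' = (0.70 \quad 0.05 \quad 0.70) \begin{pmatrix} 10{,}000 & 6{,}000 \\ 2{,}000 & 20{,}000 \\ 18{,}000 & 8{,}000 \end{pmatrix} = (\$19{,}700 \quad \$10{,}800)$$

Applying Theorem 3, we obtain

$$[(P - C)Q']' = Q(P - C)' = \begin{pmatrix} 10{,}000 & 2{,}000 & 18{,}000 \\ 6{,}000 & 20{,}000 & 8{,}000 \end{pmatrix} \begin{pmatrix} 0.70 \\ 0.05 \\ 0.70 \end{pmatrix} = \begin{pmatrix} \$19{,}700 \\ \$10{,}800 \end{pmatrix}$$

►

PROBLEMS

3. For the matrices $A$ and $B$ in Problem 2, compute $B'A'$ and verify that $(AB)' = B'A'$.

In multiplying matrices, it often helps to partition the matrices into submatrices, or blocks. An example will best indicate the procedure.

Example 4 In Example 4 of Section 8.2, we found the product of the matrices

$$A = \left( \begin{array}{cc|c} 2 & 0 & -3 \\ 1 & -1 & 4 \end{array} \right) \quad \text{and} \quad B = \left( \begin{array}{c|cc} -1 & 0 & 7 \\ 6 & 0 & -2 \\ \hline 2 & 3 & 1 \end{array} \right)$$

to be

$$AB = \begin{pmatrix} -8 & -9 & 11 \\ 1 & 12 & 13 \end{pmatrix} \qquad (12)$$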




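is $2 \times 2$. Then

$$P^2 = \begin{pmatrix} I & O \\ A & T \end{pmatrix} \begin{pmatrix} I & O \\ A & T \end{pmatrix} = \begin{pmatrix} I & O \\ A + TA & T^2 \end{pmatrix}$$

$$P^3 = P^2 P = \begin{pmatrix} I & O \\ A + TA & T^2 \end{pmatrix} \begin{pmatrix} I & O \\ A & T \end{pmatrix} = \begin{pmatrix} I & O \\ A + TA + T^2 A & T^3 \end{pmatrix}$$

and a simple induction shows that

$$P^n = \begin{pmatrix} I & O \\ A + TA + \cdots + T^{n-1} A & T^n \end{pmatrix} = \begin{pmatrix} I & O \\ \sum_{k=0}^{n-1} T^k A & T^n \end{pmatrix}$$

where $T^0 = I$. This approach is obviously much more efficient than a direct application of the basic definition of multiplication. ►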


Example 6 Need for block multiplication also arises in problems of large-scale computation. For instance, the multiplication of two $50 \times 50$ matrices, each of which contains 2500 entries, requires 125,000 multiplications and almost as many additions. One method of performing such calculations on a computer whose storage capacity is limited is to partition each of the matrices into smaller matrices. If the blocks are suitably small, the machine can successively compute products of these blocks, store the results on tape to clear the internal storage for the next block of calculations, and finally call back the individual blocks one at a time for summation. ►
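Block multiplication can be sketched on a small case. The blocks below are those obtained by partitioning the matrices of Example 4 in Section 8.2 after the second column of $A$ and after the second row and first column of $B$; the helper names (`mat_mul`, `mat_add`, `hstack`) are assumptions made for the illustration, and the sketch is not meant as an efficient large-scale method.

```python
def mat_mul(a, b):
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def hstack(left, right):
    """Place two blocks with the same number of rows side by side."""
    return [row_l + row_r for row_l, row_r in zip(left, right)]


# Blocks of A = (A1 A2) and B = (B1 B2 / B3 B4).
A1, A2 = [[2, 0], [1, -1]], [[-3], [4]]
B1, B2 = [[-1], [6]], [[0, 7], [0, -2]]
B3, B4 = [[2]], [[3, 1]]

# Multiply block by block, treating blocks as individual elements.
left = mat_add(mat_mul(A1, B1), mat_mul(A2, B3))
right = mat_add(mat_mul(A1, B2), mat_mul(A2, B4))
blockwise = hstack(left, right)

# Multiply the unpartitioned matrices directly.
A = hstack(A1, A2)
B = hstack(B1, B2) + hstack(B3, B4)
direct = mat_mul(A, B)

print(blockwise)  # [[-8, -9, 11], [1, 12, 13]]
print(direct)     # [[-8, -9, 11], [1, 12, 13]]
print(50 ** 3)    # 125000 multiplications for two 50 x 50 matrices
```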


PROBLEMS 


If 

A 0 

a A 


[°L 

c d\ (, ^ 


i 0 o 

_1 0 } \0-l) 


\o 0 

0 -1/ 

show that regardless of the values of $a$, $b$, $c$, and $d$, we have $A^2 = I_{4 \times 4}$.

. If 



'-CD 


A -A** °) 

^ 1—1/ \ ° - /2 X2/ 


/ I y' 



K o ' 

and c " { Ik l) 

show that A 2 = B 2 
CB « -2/4x4 

= c- “ 

I W , AD + BA - AC + CA - DC + 




Block Multiplication Theorem If the r X q matrix A is partitioned as 



and the q X c matrix B is partitioned as 



and if the columns of $A$ are partitioned in the same way as the rows of $B$ (so that all matrix multiplications make sense), then multiplication of the partitioned $A$ and $B$ using the blocks as individual elements produces the same result as the basic definition of matrix multiplication of the nonpartitioned $A$ and $B$.


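Example 5 Block multiplication is especially useful when special patterns appear in the matrices. Consider a $5 \times 5$ matrix $P$ whose first three rows are the first three rows of the $5 \times 5$ identity matrix. Partitioning as indicated, we write

$$P = \begin{pmatrix} I & O \\ A & T \end{pmatrix}$$

where $I$ is the $3 \times 3$ identity matrix, $O$ is a $3 \times 2$ block of zeros, $A$ is the $2 \times 3$ matrix $(\ldots)$, and $T = (\ldots)$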




find the following:

(a) $P^2$  (b) $P^3$  (c) $P^4$  (d) $P^n$

(e) Find $\lim_{n \to \infty} P^n$, where by the limit of a sequence of matrices we mean the matrix whose element, for each pair $(i, j)$, is the limit of the sequence of numbers appearing in the $(i, j)$ position of $P, P^2, P^3, \ldots$

13. If $A$ and $B$ are both $n \times n$ matrices, prove that

$$(A + B)(A - B) \neq A^2 - B^2 \quad \text{unless } AB = BA$$

14. Prove that if $A$ is a square matrix, then

$$(I + A)^n = I + nA + \frac{n(n - 1)}{2} A^2 + \cdots + A^n = \sum_{k=0}^{n} \binom{n}{k} A^k$$

(Here $A^0 = I$.)


8.4 APPLICATIONS TO DIRECTED GRAPHS AND
TO INPUT-OUTPUT ANALYSIS

The theory of directed graphs has found extensive application in the study of small groups and of formal organizations. Sociometric choice patterns, communication networks, cliques and subgroup formation, patterns of influence, and interdepartmental interaction are all amenable to study by these methods. Here we are primarily concerned with showing how matrix manipulations can be used in the analysis of directed graphs.

Let $R$ be an irreflexive relation defined on a set of objects

$$V = \{v_1, v_2, \ldots, v_n\}$$

We have seen (Section 3.5) that such a relation may be represented by a directed graph (digraph), which consists of points representing the objects $v_1, v_2, \ldots, v_n$ together with directed lines joining pairs of points. For instance, the relation $R$ depicted by the digraph in Figure 4(a) is such that $v_1 R v_2$, $v_2 R v_3$, and $v_1 R v_3$.

The adjacency matrix $A_R$ of a given relation $R$ relating objects $v_1, v_2, \ldots, v_n$ is the $n \times n$ matrix with elements $a_{ij}$ defined by

$$a_{ij} = \begin{cases} 1 & \text{if } v_i R v_j \\ 0 & \text{otherwise} \end{cases}$$




6. A square matrix of the form

$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix} \qquad (14)$$

in which all elements off the main diagonal are zero, is called a diagonal matrix. The identity matrix and all square zero matrices are diagonal. We abbreviate (14) by writing $D = \text{diag}\,(d_1, d_2, \ldots, d_n)$.

(a) Show that any two diagonal matrices of the same order commute.

(b) Show that if $D = \text{diag}\,(d_1, d_2, \ldots, d_n)$, then

$$D^p = \text{diag}\,(d_1^p, d_2^p, \ldots, d_n^p)$$

That is, the $p$th power of a diagonal matrix $D$ is a diagonal matrix in which the diagonal elements are the $p$th powers of the elements of $D$.

7. Verify that if ,1 is the matrix 


and R is the matrix 

/ UVl j/s/3 i/v/TX 
1-2/V6 1/V3 o 

\i/V6 

thin R AR = dug (4, —2, —2) 

Uio . diag Wi.r/i,*) Write a matrix equation for Z?- id 

9 Simplify {A' O' + 2C)' 


, V 2 -* -/ \ 

and c is any nunibtr, thow that 

1<vl + (1 - OflJ-t ■ ; 


1 " -■< dj and B 

- Her ‘■mr-v, 

/ uatriccs 


/ \-6 6 -ij 

~ Al, While ( A - ft) 2 - 0 



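Example 1 The square of the adjacency matrix $A$ in (15) is

$$B = A^2 = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

with rows and columns labeled $v_1, v_2, \ldots, v_5$. The nonzero entries $b_{12}$, $b_{13}$, and $b_{43}$ indicate that there is one sequence of length two from $v_1$ to $v_2$, one from $v_1$ to $v_3$, and one from $v_4$ to $v_3$. In each of the above instances, the sequence is also a path since no point is repeated. ►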


PROBLEMS 

s rss- 5 "»™ - — — “ 

digraph 

r=rs,» u s 1 --- ssjSEsss 

arises as follows Let *, and «i b e» Then the adjacency 

zzX'X - - * — “ s 

-4ft, and yJ K , for the relations itj and i ?2 » that is, if 

Similarly, if the relation S is ‘^/"'^jacency matrix' -4 S of 5 is 

52^^5S^SS£- ^ — * “ 

Example 2 The relations $R_1$ and $R_2$ whose digraphs are shown in Figure 5 have adjacency matrices

$$A_{R_1} = (\ldots) \quad \text{and} \quad A_{R_2} = (\ldots)$$

*After George Boole, nineteenth-century English mathematician.






FIGURE 4 (a) (b)

The matrix $R$ in Example 1 of Section 8.1 is an adjacency matrix for a preference relation. The adjacency matrix for the relation whose digraph appears in Figure 4(b) is

$$A = \begin{pmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \qquad (15)$$

with rows and columns labeled $v_1, v_2, v_3, v_4, v_5$.


A sequence of length n from an initial point to a terminal point is a
succession of points and n directed lines, $v_1, v_1 \to v_2, v_2, \ldots$,
in which there is no restriction on the number of times a point or a
directed line may occur. A path from $v_i$ to $v_j$ is a sequence in which
no point occurs more than once. A sequence or path with the same initial
and terminal point is said to be closed.

Let A be a given adjacency matrix and consider the element

\[
f_{ij} = a_{i1}a_{1j} + a_{i2}a_{2j} + \cdots + a_{in}a_{nj}
\]

in the ij position of $F = A^2$. The product $a_{ik}a_{kj}$ is nonzero only if
$a_{ik}$ and $a_{kj}$ are both equal to 1, that is, only if there is a sequence
of length two leading from $v_i$ to $v_j$. It follows that $f_{ij}$ is the
number of sequences of length two from $v_i$ to $v_j$.

In a similar way, we see that the element

\[
g_{ij} = f_{i1}a_{1j} + f_{i2}a_{2j} + \cdots + f_{in}a_{nj}
\]

in the ij position of $G = A^3 = FA$ indicates the number of sequences of
length three which lead from $v_i$ to $v_j$. A simple induction establishes
the following general result.
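The counting argument can be checked numerically. The following is a minimal sketch, assuming the adjacency matrix (15) as read above; the helper name `mat_mul` is hypothetical and chosen only for this illustration.

```python
def mat_mul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Adjacency matrix (15): A[i][j] = 1 when there is a directed line
# from point v_(i+1) to point v_(j+1) (Python indices start at 0).
A = [
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

A2 = mat_mul(A, A)    # entry ij counts sequences of length two
A3 = mat_mul(A2, A)   # entry ij counts sequences of length three

for name, M in (("A^2", A2), ("A^3", A3)):
    print(name)
    for row in M:
        print(row)
```

Each nonzero entry of `A2` or `A3` can be confirmed by tracing the corresponding sequences in the digraph by hand.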




FIGURE 5 (a) Digraph of $R_1$ (b) Digraph of $R_2$

The Boolean sum $A_{R_1} \oplus A_{R_2}$ and element by element product
$A_{R_1} \otimes A_{R_2}$ of these two matrices are



These correspond to the digraphs shown in Figure 6, which should be
compared with those in Figure 5. ►
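A small sketch may help fix the two operations. The matrices below are hypothetical stand-ins (they are not the matrices of Example 2), and the function names are assumptions made for this illustration.

```python
def boolean_sum(A, B):
    """Entry is 1 when the line appears in either digraph (R1 or R2)."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def elementwise_product(A, B):
    """Entry is 1 only when the line appears in both digraphs (R1 and R2)."""
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Hypothetical adjacency matrices on three points.
A_R1 = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
A_R2 = [[0, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]

print(boolean_sum(A_R1, A_R2))
print(elementwise_product(A_R1, A_R2))
```

The Boolean sum keeps every line that occurs in at least one of the two digraphs, while the element by element product keeps only the lines common to both.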


FIGURE 6 (a) Digraph of $R_1 \vee R_2$ (b) Digraph of $R_1 \wedge R_2$


Now let us examine input-output analysis. Consider an economy composed
of N industries, each of which produces a single commodity. Each
commodity may reenter the system as an input to any or all of the industries.
Specifically, we assume that unit production (output) of the jth commodity
requires a fixed quantity $a_{ij}$ of the ith commodity as input. These input
coefficients $a_{ij}$ may be arranged in an N X N matrix as

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1N} \\
a_{21} & a_{22} & \cdots & a_{2N} \\
\vdots & \vdots & & \vdots \\
a_{N1} & a_{N2} & \cdots & a_{NN}
\end{pmatrix} \tag{16}
\]

(rows correspond to input commodities, columns to output commodities
1, 2, ..., N), in which entries in the jth column (j = 1, 2, ..., N) specify
the input requirements for production of one unit of the jth commodity.
This input-output model is the Leontief model.*

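Example 3 The first column of an input matrix A, whose columns are labeled
by output commodity, lists the quantities of commodities 1, 2, and 3
required for the production of one unit of commodity 1. In the matrix of
this example, none of commodity 3 is required. Similar requirements for
commodities 2 and 3 may be read from the other columns. ►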

The simplest Leontief model arises when the output of each industry serves
only as input to the various industries within the economic system and no
other inputs are required. In this case we refer to a closed Leontief system.
Let $x_j$ (j = 1, 2, ..., N) denote the output of the jth industry. Since
production of $x_j$ requires $a_{ij}x_j$ units of the ith output, the total
quantity $y_i$ required of output i is, for i = 1, 2, ..., N,

\[
y_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{iN}x_N = \sum_{j=1}^{N} a_{ij}x_j \tag{17}
\]

The N equations (17) may be written in matrix form as Y = AX, where A
is the input-output matrix (16), $Y = \mathrm{col}\,(y_1, y_2, \ldots, y_N)$ is a
column vector of input requirements, and $X = \mathrm{col}\,(x_1, x_2, \ldots, x_N)$
is the vector of outputs.

In a closed system, there are no inputs from outside the system, and it is
apparent that the system cannot function when an input requirement $y_i$
exceeds output $x_i$. A closed system which can function, that is, for which
there is a vector $X \ge O$ such that

\[
Y = AX \le X
\]

is called viable. Here the symbol O denotes a vector of zeros. The matrix
inequality $B \ge C$ means that for each i and j the ij elements $b_{ij}$ and
$c_{ij}$ of B and C satisfy the inequality $b_{ij} \ge c_{ij}$. Thus $X \ge O$
means that no element of X is negative, reflecting the fact that negative
output is impossible. A vector $X > O$ such that

\[
AX = X \tag{18}
\]

is called an interior equilibrium of the system. Methods for determining
whether a system is viable and for finding interior equilibria will be
discussed in Chapter 9.

*After Wassily Leontief, who pioneered this approach to the analysis of economic systems.
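The two conditions can be tested directly for a candidate output vector. This is a minimal sketch under stated assumptions: the 2 X 2 coefficient matrix is hypothetical, the function names are invented for the illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

def mat_vec(A, X):
    """Input requirements Y = AX for output vector X."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def functions_with(A, X):
    """True when X >= O and AX <= X, so requirements never exceed outputs."""
    Y = mat_vec(A, X)
    return all(x >= 0 for x in X) and all(y <= x for y, x in zip(Y, X))

def is_interior_equilibrium(A, X):
    """True when every output is positive and AX = X, as in (18)."""
    return all(x > 0 for x in X) and mat_vec(A, X) == list(X)

# Hypothetical closed two-industry system.
A = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]

print(is_interior_equilibrium(A, [1, 1]))  # requirements equal outputs
print(functions_with(A, [2, 1]))           # second requirement exceeds output
```

For the vector (2, 1) the second industry would need 5/4 units of input against an output of 1, so that output mix cannot be sustained.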

An open Leontief system consists of a productive sector of N industries as
above, and a nonproductive, or consumer sector, which creates a demand for
commodities over and above their use as inputs for production. The vector
Y = AX again represents the inputs required by the productive sector. The
difference

\[
X - AX = (I - A)X
\]

thus represents the net output available to the nonproductive (open)
segment of the economy. The fundamental problem is whether the economy
can satisfy the demands of the open segment. That is, if D is the demand
vector of the consumer sector, the problem is to determine whether there
exists a vector $X \ge O$ such that

\[
(I - A)X = D \tag{19}
\]

This problem, too, will be solved in the next chapter.
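The net output (I - A)X is easy to compute once A and X are given. The sketch below uses a hypothetical two-industry coefficient matrix and output vector; `net_output` is a name chosen for this illustration only.

```python
from fractions import Fraction as F

def net_output(A, X):
    """Return (I - A)X: each output minus the amount used up as input."""
    return [x - sum(a * xj for a, xj in zip(row, X))
            for row, x in zip(A, X)]

# Hypothetical open two-industry economy.
A = [[F(1, 5), F(3, 10)],
     [F(2, 5), F(1, 10)]]
X = [10, 20]

net = net_output(A, X)
print(net)
```

With these assumed figures the productive sector consumes 8 units of the first commodity and 6 of the second, so a consumer demand of D = col (2, 14) is met exactly by this output vector.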
PROBLEMS 


irreflexive relation, say the relation "chooses," and let A
be the adjacency matrix of this relation. Prove that the number of
persons chosen by both $v_i$ and $v_j$ is the element in the ij position
of the matrix product AA'. Interpret the diagonal entries.

3. Argue that in the situation presented in Problem 2 the number of
persons who choose both $v_i$ and $v_j$ is given by the ij element in the
product A'A. Interpret the diagonal entries in this product.

4. Show that if there is a vector $X > O$ such that (A - I)X = O, then X
is an interior equilibrium of a closed Leontief model.

5. In a three-industry open economy, the input coefficient matrix is

\[
A = \begin{pmatrix}
0.2 & 0.3 & 0.2 \\
0.4 & 0.1 & 0.2 \\
0.1 & 0.3 & 0.4
\end{pmatrix}
\]

If the output vector is col (10, 6, 5), find the net output available for
the open sector of the economy.


6. Verify that the vectors col (1, 1, 1) and col (2, 2, 2) are both interior
equilibria of the closed Leontief system with input-coefficient matrix

\[
\begin{pmatrix}
0.2 & 0.3 & 0.5 \\
0.4 & 0.4 & 0.2 \\
0.1 & 0.5 & 0.4
\end{pmatrix}
\]

In fact, show that every vector $X = \mathrm{col}\,(x_1, x_2, x_3)$ with positive
entries and $x_1 = x_2 = x_3$ is an interior equilibrium.

7. In a digraph for a relation R defined on a set of points $v_1, v_2, \ldots, v_n$,
a point $v_j$ is said to be reachable from a point $v_i$ ($v_i\,T\,v_j$) if there
is a path from $v_i$ to $v_j$. We agree to say that any point is reachable
from itself by a path of zero length, so that T is reflexive.
(a) Argue that T is also a transitive relation.
(b) Give examples to show that T is neither symmetric nor asymmetric.

8. In Problem 7,
(a) Argue that if $v_j$ is reachable from $v_i$, it can be reached in
n - 1 steps or less, where n is the number of points in the graph.
(b) Let A be the adjacency matrix for the relation R. Denote the ij
element in $A^k$ by $a_{ij}^{(k)}$. Show that (a) implies that $v_j$ is
reachable from $v_i$ if and only if at least one of the elements
$a_{ij}^{(0)}, a_{ij}^{(1)}, \ldots, a_{ij}^{(n-1)}$ is nonzero.
(c) Let B denote the adjacency (reachability) matrix for the relation T
defined by

\[
b_{ij} = \begin{cases} 1 & \text{if } v_i\,T\,v_j \\ 0 & \text{otherwise} \end{cases}
\]

Show that $B = I \oplus A \oplus A^2 \oplus \cdots \oplus A^{n-1}$ (Boolean addition).

9. In Problem 8, let two points $v_i$ and $v_j$ communicate ($v_i\,C\,v_j$) if
each is reachable from the other.
(a) Argue that C is an equivalence relation.
(b) Show that the adjacency matrix for C is $B \otimes B'$, where B is
the reachability matrix and $\otimes$ is the element by element product,
and that the sum of the elements in a row of this matrix is the number of
points which communicate with the corresponding point.

(c) Develop a matrix representation for .1* number of points which 

communicate through paths of on y on 

10. From the digraph in Figure 7 compute
(a) the adjacency matrix
(b) the reachability matrix
(c) the communication matrix



FIGURE 7 



11. A social behavior model due to Rashevsky* postulates that the
satisfaction accruing to an individual increases with the effort of everyone
in the society, but decreases with the amount of effort personally expended.
Symbolically, the satisfaction $S_i$ of the ith individual in an N-member
society is given by

\[
S_i = a_i \sum_{j=1}^{N} x_j p_{ji} - b_i x_i^2
\]

where $x_j$ denotes the amount of effort exerted by the jth individual,
$p_{ji}$ is a measure of the effect on individual i of effort by individual j,
and $a_i$ and $b_i$ are constants associated with the ith individual.

(a) Let $S = (S_1, S_2, \ldots, S_N)$ be the row vector of individual
satisfactions, $P = (p_{ij})$ the matrix of the coefficients $p_{ij}$, A be the
diagonal matrix (see Problem 6, Section 8.3) $A = \mathrm{diag}\,(a_1, a_2, \ldots, a_N)$,
$X = (x_1, x_2, \ldots, x_N)$ the row vector of efforts, $X_D$ the diagonal
matrix $X_D = \mathrm{diag}\,(x_1, x_2, \ldots, x_N)$, and $B = (b_1, b_2, \ldots, b_N)$.
Show that

\[
S = XPA - BX_D^2
\]

(b) Rashevsky defines a higher satisfaction $\bar{S}_i$ for individual i as the sum

\[
\bar{S}_i = \sum_{j=1}^{N} k_{ij} S_j
\]

where $k_{ij}$ is the degree to which individual i is concerned with
the satisfaction of individual j. Show that

\[
\bar{S} = K[AP'X' - X_D^2 B']
\]

where $\bar{S}$ is the vector of higher satisfactions and $K = (k_{ij})$ is the
matrix of concern coefficients.

*Coleman, J. S., "An Expository Analysis of Some of Rashevsky's Social Behavior
Models," in Mathematical Thinking in the Social Sciences, Lazarsfeld, P. F., Ed.
(Free Press, Glencoe, Ill., 1954).



SUPPLEMENTARY READING 


Harary, F., Norman, R. Z., and Cartwright, D., Structural Models: An Introduction
to the Theory of Directed Graphs (John Wiley & Sons, Inc., New York, 1966).

Harary, F., "Graph Theory and Group Structure," in Readings in Mathematical
Psychology, Luce, R. D., Bush, R. R., and Galanter, E., Eds. (John Wiley &
Sons, Inc., New York, 1965), Vol. 2, 225-241.

Horst, P., Matrix Algebra for Social Scientists (Holt, Rinehart and Winston, New
York, 1963).

Labovitz, S., "Application of Matrix Algebra to Social Units," Sociology and
Social Research 51, 220-234 (1967).

Zelinsky, D., A First Course in Linear Algebra (Academic Press Inc., New York,
1968).



LINEAR EQUATIONS 

AND LINEAR PROGRAMMING 


9.1 LINEAR EQUATIONS

The system of equations

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m
\end{aligned} \tag{1}
\]

is a system of m linear equations in n variables. Such a system may be
rewritten in the matrix form

\[
AX = C
\]

where $A = (a_{ij})$, $X = \mathrm{col}\,(x_1, x_2, \ldots, x_n)$, and
$C = \mathrm{col}\,(c_1, c_2, \ldots, c_m)$. Here A is called the matrix of
coefficients, X the vector of unknowns, and C the vector of constants
associated with the equations. Any n dimensional vector X for which
AX = C is called a solution of the system. The problem is to find all
solutions.

The most important technique for solving a system of linear equations
involves successive elimination of variables.



Example 1 To solve the system

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_1 + x_2 + x_3 &= 0 \\
2x_1 + 2x_2 - x_3 &= 1
\end{aligned} \tag{2}
\]

we begin by eliminating $x_1$ from each equation except the first. To
accomplish this we add the first equation to the second and then add -2 times
the first equation to the third equation. These operations produce the new
system

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_2 + 4x_3 &= 4 \\
6x_2 - 7x_3 &= -7
\end{aligned}
\]

In this system we first multiply the second equation through by -1 to
obtain

\[
x_2 - 4x_3 = -4 \tag{3}
\]

Then we eliminate $x_2$ from the first and third equations by adding twice
Equation (3) to the first equation and -6 times Equation (3) to the third.
The result is

\[
\begin{aligned}
x_1 - 5x_3 &= -4 \\
x_2 - 4x_3 &= -4 \\
17x_3 &= 17
\end{aligned}
\]

Dividing the last equation by 17 gives $x_3 = 1$. Adding five times this
equation to the first equation and four times this equation to the second
equation gives

\[
x_1 = 1, \qquad x_2 = 0, \qquad x_3 = 1
\]

as the only solution to the original set of equations.
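The same sequence of operations can be replayed mechanically. The sketch below stores each equation of Example 1 as a row of coefficients followed by its right side; the helper names `scale` and `add_multiple` are assumptions made for this illustration.

```python
from fractions import Fraction

def scale(rows, i, c):
    """Multiply equation i by the nonzero constant c."""
    rows[i] = [c * v for v in rows[i]]

def add_multiple(rows, src, dst, c):
    """Add c times equation src to equation dst."""
    rows[dst] = [d + c * s for s, d in zip(rows[src], rows[dst])]

# Coefficients of x1, x2, x3 and the right side, for each equation.
rows = [[Fraction(v) for v in r]
        for r in ([1, -2, 3, 4], [-1, 1, 1, 0], [2, 2, -1, 1])]

add_multiple(rows, 0, 1, 1)        # eliminate x1 from the second equation
add_multiple(rows, 0, 2, -2)       # eliminate x1 from the third equation
after_x1 = [r[:] for r in rows]

scale(rows, 1, -1)                 # obtain x2 - 4*x3 = -4
add_multiple(rows, 1, 0, 2)        # eliminate x2 from the first equation
add_multiple(rows, 1, 2, -6)       # eliminate x2 from the third equation
after_x2 = [r[:] for r in rows]

scale(rows, 2, Fraction(1, 17))    # obtain x3 = 1
add_multiple(rows, 2, 0, 5)        # eliminate x3 from the first equation
add_multiple(rows, 2, 1, 4)        # eliminate x3 from the second equation

solution = [r[3] for r in rows]
print(solution)
```

Exact fractions are used so that each intermediate system can be compared with the hand computation without rounding error.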

Example 2 The same procedure may be followed in solving the system

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_1 + x_2 + x_3 &= 0 \\
2x_1 + 2x_2 - 18x_3 &= -16
\end{aligned} \tag{4}
\]

which, except for the third equation, is the same as the system in Example 1.
As before, we start by eliminating $x_1$ from the second and third equations:

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_2 + 4x_3 &= 4 \\
6x_2 - 24x_3 &= -24
\end{aligned} \tag{5}
\]

Then, we multiply the second equation by -1 and eliminate $x_2$ from the
first and third equations:

\[
\begin{aligned}
x_1 - 5x_3 &= -4 \\
x_2 - 4x_3 &= -4 \\
0x_2 + 0x_3 &= 0
\end{aligned} \tag{6}
\]

The third equation in (6) consists of nothing but zeros because in the
preceding system (5), the third equation was proportional to (-6 times) the
second equation. Since this equation puts no restriction on the unknowns,
it may be eliminated, leaving

\[
\begin{aligned}
x_1 - 5x_3 &= -4 \\
x_2 - 4x_3 &= -4
\end{aligned}
\]

We now have but two restrictions on three unknowns. Obviously one of
the variables, say $x_3$, may be assigned any value whatever, after which
corresponding values of $x_1$ and $x_2$ may be determined. For instance, if
$x_3$ is assigned the value 1, then

\[
x_1 = -4 + 5(1) = 1 \quad \text{and} \quad x_2 = -4 + 4(1) = 0
\]

In general, if $x_3 = b$, then

\[
\begin{aligned}
x_1 &= -4 + 5b \\
x_2 &= -4 + 4b \\
x_3 &= b
\end{aligned} \tag{7}
\]

Equation (7) represents the form which all solutions of Equations (4) must
have. Particular solutions are obtained by assigning specific values to b. ►

Example 3 Consider the system

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_1 + x_2 + x_3 &= 0 \\
2x_1 + 2x_2 - 18x_3 &= 0
\end{aligned} \tag{8}
\]

which differs from (4) only in the right side of the third equation.
Proceeding as before, we obtain

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_2 + 4x_3 &= 4 \\
6x_2 - 24x_3 &= -8
\end{aligned}
\]

and then

\[
\begin{aligned}
x_1 - 5x_3 &= -4 \\
x_2 - 4x_3 &= -4 \\
0x_2 + 0x_3 &= 16
\end{aligned}
\]

There are no possible choices for $x_1$, $x_2$, and $x_3$ which will make
$0x_2 + 0x_3 = 16$. It follows that there are no solutions and that the
system (8) is an inconsistent system of equations.
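The three outcomes seen in Examples 1-3 (one solution, many solutions, no solution) can be told apart automatically. The following is a sketch rather than a definitive routine; the names `reduce_system` and `classify` are hypothetical, and the reduction also allows an interchange of equations when a leading coefficient is zero.

```python
from fractions import Fraction

def reduce_system(aug):
    """Successive elimination on rows of coefficients plus right side.
    Returns the reduced rows and the columns in which a variable was isolated."""
    rows = [[Fraction(v) for v in r] for r in aug]
    n_vars = len(rows[0]) - 1
    pivots, r = [], 0
    for col in range(n_vars):
        if r == len(rows):
            break
        pr = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pr is None:
            continue                     # this variable cannot be isolated here
        rows[r], rows[pr] = rows[pr], rows[r]
        pivot = rows[r][col]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def classify(aug):
    rows, pivots = reduce_system(aug)
    n_vars = len(aug[0]) - 1
    # A row 0 = nonzero is an impossible equation.
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in rows):
        return "inconsistent"
    return "unique" if len(pivots) == n_vars else "many"

example_1 = [[1, -2, 3, 4], [-1, 1, 1, 0], [2, 2, -1, 1]]
example_2 = [[1, -2, 3, 4], [-1, 1, 1, 0], [2, 2, -18, -16]]
example_3 = [[1, -2, 3, 4], [-1, 1, 1, 0], [2, 2, -18, 0]]

for system in (example_1, example_2, example_3):
    print(classify(system))
```

For Example 2 the reduced rows returned by `reduce_system` are exactly the system (6), including its row of zeros.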


PROBLEMS 

1. Find all solutions of the systems

(a) x - y + z = 4
    2x - 4y + 3z = -1
    - 2y + 3z = 0

(b) x - y + z = 4
    2x - 4y + 3z = -1
    -x + 3y - 2z = 0

(c) x - y + z = 4
    2x - 4y + 3z = -1
    -x + 3y - 2z = 5

In solving the systems of linear equations in Examples 1-3, we
performed only two basic types of operations: multiplication of an
equation by a nonzero constant or addition of a multiple of one equation to
another. Each such operation can be reversed by performing another
operation of the same type. Specifically, if an equation is multiplied by
$c \neq 0$, the original equation may be recovered by multiplying the
new one by 1/c. If c times the ith equation is added to the jth equation, the
original jth equation is obtained by adding -c times the ith equation to the
new jth equation. Since, for each operation, the original system of equations
may be obtained from the new system by an operation of the same type, it
follows that all systems of equations obtained in this way have the same
solutions.

In performing a succession of operations on a system of linear equations
we either reach a point at which all coefficients of the variables in one
equation are zero (Examples 2 and 3) or we do not (Example 1). In the former
case we may eliminate that equation if the right side is also zero (Example 2)
or decide that the system of equations has no solution if the right side is not
zero (Example 3). In the end we obtain either one solution (Example 1),
many solutions, or no solution at all. Successive elimination will always
work, giving the solutions if there are any and leading to an impossible
equation if there are none. The method thus has the threefold advantage of
being logically simple, computationally straightforward, and foolproof.






13. A three-sector open economy has the following matrix of input
coefficients

Producer 


Consumer 

Manufacturing 


Agriculture 


Services 
Manufacturing 
Agriculture 


01 

04 

02 


04 
0 05 
03 


03 

05 

01 


If the demands for the three sectors from the nonproductive sector of
the economy are 10, 25, and 20 billion dollars, respectively, find the
outputs needed to satisfy these demands. (Hint: Use Equation (19) in
Section 8.4.)


14. The economic model

Y = C + I
C = a + bY
I = u + vY

was introduced in Problem 14 of Section 8.2.

(a) Express this system in matrix form.

(b) Solve for Y, C, and I in terms of the constants.


9.2 ELEMENTARY OPERATIONS ON MATRICES

By now you may have noticed that in

the scalars $c_i$. In Example 1, for instance, we




PROBLEMS 


In each of Problems 2-9, find all solutions of the given system of equations.

2. $x_1 + 3x_2 = 6$
   $2x_1 + 4x_2 = 10$

3. $x - 3y = 4$
   $2x + 6y = 12$

4. $x_1 + 6x_2 = 0$
   $4x_1 - 12x_2 = 0$

5. $x_1 + 2x_2 - 4x_3 = 0$
   $2x_1 + 4x_2 - 2x_3 = 0$

6. $x_1 + x_2 + 2x_3 = x_1$
   $-x_1 + 2x_2 + 3x_3 = 2x_2$
   $-2x_1 - 3x_2 + x_3 = x_3$

7. $x + 2y - 3z = -2$
   $4x - y + 2z = 8$
   $13x - y + 3z = 22$

8. $x + 3y + 2z + 7t = -5$
   $3x + 7y + 2z + 11t = 1$
   $2x + 5y + 3z + 12t = -7$
   $-x + z + t = -6$

9. $x + 2y - 3z = 0$
   $3x - y + z = 0$
   $x + y + z = 0$


10. A plant uses two machines $M_1$ and $M_2$ in the production of two products
P and Q. Due to other production scheduling, machine $M_1$ is available
12 hours per day, while $M_2$ is available only 9 hours per day. To produce
one unit of product P requires i hour of Mi time and 1 hour of Me 
time Each unit of Q requires 5 hour of M, time and J hour of Mi time 
How many units of each product should be produced per day to keep 
each machine fully utilized'' 


11. Suppose that Commercial Airlines has available three types of air cargo
transports, capable of carrying three types of equipment according to the
following load chart:

                 Equipment
Transport      I     II    III
    A          1     2     1
    B          0     1     2
    C          2     1     1


actly 16 lvne P *| n |n typc 5 ^ ou ^ be dispatched to transport ex- 

planels X t^ST B ' 12 C machines 9 (Assume that each 

tt.omicfmodD WlnS mpul * cocfricll - nl matrix for a three-industry closed 

/°3 01 04 \ 

02 08 06 
\05 01 0 / 

that will ”t ud, * Jnum cxlsts is there a unique output mix 

non OS, m tenon 8 4 r W <"»' Use Equa- 





are said to have the same form if for each i = 1, 2, ..., n the corresponding
coefficients $a_i$ and $b_i$ are either both zero or both nonzero. Thus the
functions

2x + 6y and 3x - 2y

have the same form, whereas

3x + y and 6x + 3y - z

do not. A linear function L is identifiable if it has a unique form. That is, it
must not be possible to apply row operations R1-R3 to the linear functions
other than L so as to produce a linear function having the same form as L.
As a specific illustration, consider the system of four linear functions
having coefficient matrix



-2 
0 6 
0 

K 0 0 10 

We perform three row operations:

(i) Add -4 times the second row to the third row.

(ii) Add -5 times the second row to the fourth row.

(iii) Add -1 times the new third row to the new fourth row.

The resulting matrix is

1 -2 
0 6 
8 —24 


( 11 ) 



(12) 


in which the first and fourth rows have the same form. Hence the first linear
form is not identifiable.

A matrix which is obtained from an identity matrix by performing one of
the row operations R1-R3 is called an elementary row matrix.

Example 2 (a) The elementary row matrix

\[
E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}
\]

is obtained from the 3 X 3 identity matrix by adding twice the first row to



At each stage we have performed one of the following elementary row
operations on the matrix in question. In fact, operation R3 below was not
performed, but might be used, for instance, if the coefficient of $x_1$ in the
first equation were zero.

(R1) Multiplication of a row (vector) by a nonzero real number.

(R2) Replacement of one row by the sum of that row and a constant
times another row.

(R3) Interchange of two rows.


Example 1 Mathematical models of behavioral systems* frequently take
the form of a set of linear equations. The problem facing the experimenter
who wishes to use such a model is to estimate the elements of the matrix of
coefficients of the equations on the basis of observed empirical results.
To illustrate the difficulties which may arise, consider a simple economic
model with demand function

\[
D = a + bP + cY
\]

and supply function

\[
S = d + eP
\]

Here a, b, c, d, and e are constants, P denotes price, and Y denotes income.
If we multiply the demand function by a constant $\alpha$ (row operation R1)

\[
\alpha D = \alpha a + \alpha bP + \alpha cY \tag{10}
\]

and add $\beta$ times the supply function (operation R2) to (10), we obtain a
linear combination of D and S of the form

\[
\alpha D + \beta S = (\alpha a + \beta d) + (\alpha b + \beta e)P + \alpha cY
\]

This is an expression having the same general form as D itself. Thus, if the
constants a, b, and c in D are not known but must be estimated, the
experimenter cannot distinguish the true demand function D from any linear
combination of D and S which he might observe.

The supply function, on the other hand, is identifiable, since there is no
way the two functions can be combined using elementary operation R2
without changing the form of S.

The notion of identifiability may be formulated more precisely as follows.
Two linear functions

\[
a_1x_1 + a_2x_2 + \cdots + a_nx_n
\]

\[
b_1x_1 + b_2x_2 + \cdots + b_nx_n
\]

CMcGraw Hitt tteok^ornpany.In^^t^^ypri^^^gj Methodology m Social Research 




obtained by adding c times the kth row of I to the rth row and A is any
n X m matrix. The ij element in EA is

\[
(EA)_{ij} = \sum_{s=1}^{n} e_{is}a_{sj}
\]

obtained by multiplying the ith row of E by the jth column of A. If i is not
equal to r, the only nonzero element in the ith row of E is $e_{ii} = 1$ in the
ith position. The element $e_{ii}$ multiplies $a_{ij}$, so that in this case we have

\[
(EA)_{ij} = 1 \cdot a_{ij} = a_{ij}
\]

In all rows but the rth, then, the elements of EA are equal to the
corresponding elements of A.

In the rth row of E, we find c in the kth position, 1 in the rth position, and
zeros everywhere else. Thus

\[
(EA)_{rj} = e_{rk}a_{kj} + e_{rr}a_{rj} = c \cdot a_{kj} + 1 \cdot a_{rj} = ca_{kj} + a_{rj}
\]

The elements in the rth row of EA are obtained by taking the elements in
the rth row of A and adding to those c times the corresponding elements
from the kth row of A. This completes the proof.

For each statement about rows of a matrix there is an analogous statement
about columns. Thus, there are three elementary column operations:

(C1) Multiplication of a column by a nonzero real number.

(C2) Replacement of a column by the sum of that column and a constant
times another column.

(C3) Interchange of two columns.

An elementary column matrix is a matrix obtained from an identity matrix by
performing a single column operation. Proof of the following theorem is
left as an exercise.


Theorem 2 If $E_{n \times n}$ is an elementary column matrix obtained from the
identity by performing an elementary column operation, and if A is any
m X n matrix, then AE is the matrix obtained by performing that same
column operation on A.

Note that row operations are performed by multiplying on the left by an
elementary row matrix, while column operations are performed by
multiplying on the right by an elementary column matrix.
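The effect of multiplying by an elementary matrix on the left or on the right can be checked on small numerical cases. This is a minimal sketch; the matrices B and A2x3 are hypothetical, and the helper names are assumptions made for the illustration (indices are 0-based in the code).

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add_entry(n, r, k, c):
    """Identity matrix with the extra entry c placed in row r, column k."""
    E = identity(n)
    E[r][k] += c
    return E

# Row operation R2: add twice the first row to the third row.
E_row = add_entry(3, 2, 0, 2)
B = [[1, 2],
     [3, 4],
     [5, 6]]
EB = mat_mul(E_row, B)          # multiply on the left

# Column operation C2: add twice the first column to the second column.
E_col = add_entry(3, 0, 1, 2)
A2x3 = [[1, 2, 3],
        [4, 5, 6]]
AE = mat_mul(A2x3, E_col)       # multiply on the right

print(EB)
print(AE)
```

In `EB` only the third row of B changes, and in `AE` only the second column changes, in line with the statements about left and right multiplication.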



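the third row. If B is any 3 X 2 matrix, the product

\[
EB = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}
= \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ 2b_{11} + b_{31} & 2b_{12} + b_{32} \end{pmatrix}
\]

may be obtained from B by adding twice the first row to the third row.
(b) If E is the elementary row matrix obtained from the 3 X 3
identity by interchanging the first and third rows, and if B is any 3 X 3
matrix, then

\[
EB = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix}
= \begin{pmatrix} b_{31} & b_{32} & b_{33} \\ b_{21} & b_{22} & b_{23} \\ b_{11} & b_{12} & b_{13} \end{pmatrix}
\]

The product EB is just B with first and third rows interchanged. ►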


1. Work out the products in Example 2.

2. Let E = diag (3, 1, 1, 1) be the matrix which you obtain from the 4 X 4
identity by multiplying the first row by 3. Compare the product EB,
where B is 4 X 5, with B itself.

The results of Example 2 and Problem 2 suggest the following general
result.


Theorem 1 If $E_{n \times n}$ is an elementary row matrix obtained from the identity
by performing one elementary row operation, and if A is any n X m matrix,
then EA is the matrix obtained by performing that same row operation on
the matrix A.

We give the proof only for row operation R2, leaving the other cases as
exercises. Consider the matrix product EA, where $E_{n \times n}$ is the matrix
having ones on the diagonal, the constant c in the rth row and kth column,
and zeros elsewhere,

is called the reciprocal or the multiplicative inverse of a. It has the property
that $a^{-1}a = aa^{-1} = 1$. By analogy, an n X n matrix B is called an inverse
of the n X n matrix A if

\[
BA = AB = I \tag{13}
\]

A matrix which has an inverse is said to be nonsingular. Other square matrices
are called singular.*

A matrix, if it has an inverse at all, can have only one. For if B and C are
both inverses of A, that is, if BA = AB = I and CA = AC = I, then

\[
C = CI = C(AB) = (CA)B = IB = B
\]

Thus, we speak of the inverse of A, which we denote by $A^{-1}$.
Note that the definition (13) is symmetric in A and B. That is, if B is the
inverse of A, then A is also the inverse of B. Symbolically,

\[
(A^{-1})^{-1} = A
\]


Example 1 To find the inverse of

\[
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\]

we look for a matrix

\[
B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

with the property that AB = I. Writing out the matrix product gives

\[
AB = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} a + 2c & b + 2d \\ 3a + 4c & 3b + 4d \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]

Equating corresponding coefficients, we have four linear equations

\[
\begin{aligned}
a + 2c &= 1 & b + 2d &= 0 \\
3a + 4c &= 0 & 3b + 4d &= 1
\end{aligned}
\]

whose solution is a = -2, b = 1, c = 3/2, and d = -1/2. Hence

\[
B = A^{-1} = \begin{pmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{pmatrix}
\]

►


PROBLEMS

1. Verify in Example 1 that BA = I.

Example 2 Since I I = I, the identity matrix is its own inverse. ►
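The four equations of Example 1 can be solved once and for all for a general 2 X 2 matrix. The sketch below uses the resulting standard closed form, which is an assumption here in the sense that the text derives only the numerical case; the function names are hypothetical.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inverse_2x2(A):
    """Solve AB = I for B when A = [[p, q], [r, s]]; None if no solution."""
    (p, q), (r, s) = A
    det = p * s - q * r
    if det == 0:
        return None          # the four equations are inconsistent
    return [[F(s, det), F(-q, det)],
            [F(-r, det), F(p, det)]]

A = [[1, 2],
     [3, 4]]
B = inverse_2x2(A)
I = [[1, 0], [0, 1]]

print(B)
print(mat_mul(A, B) == I, mat_mul(B, A) == I)
```

Both products AB and BA reduce to the identity, as definition (13) requires, and the entries of `B` agree with the values a = -2, b = 1, c = 3/2, d = -1/2 found by hand.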

*Although it is possible to extend the concept of inverse to matrices which are
not square, we shall not do so in this text.



PROBLEMS 


3. Find the matrix product AE, where A is any 4 X 3 matrix and E is the
elementary column matrix

(a)
\[
E = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

(b)


0 0 
0 1 
1 0 


4. Write elementary row matrices which will produce the successive arrays
in (9). Verify your answer by multiplying these by the initial array.
(Hint: Each new array requires two or three row operations.)

5. In Example 1, find the elementary row matrices required to transform
the initial coefficient matrix (11) to the final coefficient matrix (12).

6. Prove that the first row of the matrix (11) is not identifiable by using
column instead of row operations. (Hint: Operate on the transpose of
the coefficient matrix.)

In Problems 7-10, determine whether the first row is identifiable (see
Example 1).



9.3 THE INVERSE OF A MATRIX

If a is a nonzero real number, the number

\[
a^{-1} = \frac{1}{a}
\]



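whose inverse is

\[
A^{-1} = \begin{pmatrix}
\tfrac{3}{17} & -\tfrac{4}{17} & \tfrac{5}{17} \\
-\tfrac{1}{17} & \tfrac{7}{17} & \tfrac{4}{17} \\
\tfrac{4}{17} & \tfrac{6}{17} & \tfrac{1}{17}
\end{pmatrix}
\]

With $A^{-1}$ known, it is a simple matter to solve the linear equations (15).
We multiply both sides of AX = C by $A^{-1}$ to obtain

\[
A^{-1}(AX) = A^{-1}C
\]

Since $A^{-1}A = I$ and IX = X, this reduces to

\[
X = A^{-1}C
\]

Performing the indicated matrix multiplication yields

\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = A^{-1}C =
\begin{pmatrix}
\tfrac{3}{17} & -\tfrac{4}{17} & \tfrac{5}{17} \\
-\tfrac{1}{17} & \tfrac{7}{17} & \tfrac{4}{17} \\
\tfrac{4}{17} & \tfrac{6}{17} & \tfrac{1}{17}
\end{pmatrix}
\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}
\]

from which we read $x_1 = 1$, $x_2 = 0$, and $x_3 = 1$, verifying our previous
computations.

►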


PROBLEMS

4. Verify in Example 4 that $A^{-1}A = I$.

When the matrix equation AX = C is solved to give $X = A^{-1}C$, there
appears to be a considerable saving in time and effort over the method of
Section 9.1. However, the gain is more apparent than real. Unfortunately,
there remains the problem of finding the inverse $A^{-1}$. Moreover, the
method of Example 4 obviously fails if A does not have an inverse. We
therefore continue to solve linear equations by successive elimination of
variables (row operations).

The problem of finding the inverse of a matrix is itself best approached
through elementary row operations. To see how to proceed, suppose that
by performing row operations we are able to transform the n X n matrix A
into the identity matrix I. This means that there are elementary row matrices
$R_1, R_2, \ldots, R_t$ such that

\[
R_tR_{t-1} \cdots R_2R_1A = I \tag{16}
\]

By definition, then, the product

\[
R = R_tR_{t-1} \cdots R_2R_1 \tag{17}
\]

is the inverse of A since RA = I. But

\[
A^{-1} = R = RI = R_tR_{t-1} \cdots R_2R_1I \tag{18}
\]

and this, together with (16), provides the following scheme.



PROBLEMS


2. Find all 2 X 2 matrices A which are their own inverses. (Hint: Solve
the equation AA = I.)


Example 3 The matrix

\[
A = \begin{pmatrix} 1 & 2 \\ -3 & -6 \end{pmatrix}
\]

has no inverse. For if

\[
B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

is to be the inverse of A, we must have

\[
\begin{pmatrix} 1 & 2 \\ -3 & -6 \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} a + 2c & b + 2d \\ -3a - 6c & -3b - 6d \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]

Equating corresponding elements gives the four equations

\[
\begin{aligned}
a + 2c &= 1 & b + 2d &= 0 \\
-3a - 6c &= 0 & -3b - 6d &= 1
\end{aligned} \tag{14}
\]

The equation -3a - 6c = -3(a + 2c) = 0 requires a + 2c = 0, while the
first equation requires a + 2c = 1. Obviously, no choices for a and c can
satisfy both, so Equations (14) have no solution and A has no inverse. ►

This example shows that, unlike real numbers, there are nonzero
matrices for which no inverse exists.

PROBLEMS

3. Prove that a zero matrix cannot have an inverse.

Example 4 We saw in Example 1, Section 9.1, that the equations

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 4 \\
-x_1 + x_2 + x_3 &= 0 \\
2x_1 + 2x_2 - x_3 &= 1
\end{aligned} \tag{15}
\]

have the solution $x_1 = 1$, $x_2 = 0$, $x_3 = 1$. The matrix of coefficients is

\[
A = \begin{pmatrix} 1 & -2 & 3 \\ -1 & 1 & 1 \\ 2 & 2 & -1 \end{pmatrix}
\]



PROBLEMS

5. In Example 5, verify that $AA^{-1} = A^{-1}A = I$.

If the reduction (16) is possible, then A has the inverse indicated in (17).
Equivalently, if A has no inverse (is singular), then it will not be possible to
change A into an identity matrix. We state without proof that this is always
indicated by the appearance of a zero row at some stage in the procedure.


Example 6 If we attempt to find the inverse of



we are led to row operations indicated in Table 2. Since it is not possible to
transform the third column of A into a vector having one in the third position
and zeros elsewhere without altering the desired values in the first and second
columns, we conclude that A has no inverse.



PROBLEMS

In Problems 6-11 first try to find the inverse of the matrix of coefficients of
the given system of equations. If an inverse exists, use it to solve the system.
If no inverse exists, try to solve the equations in the usual way by successive



Theorem 3 In order to find the inverse of a nonsingular matrix A, we
perform row operations on A and also on the identity matrix I. The same
operations which change A into I (Equation (16)) will transform I into $A^{-1}$
(Equation (18)). ►
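The scheme of Theorem 3 can be sketched in a few lines: A and I are written side by side, and every row operation is applied to both. This is an illustrative sketch, not the only way to organize the computation; the function name `inverse` and the choice of which row to use at each stage are assumptions, and the 3 X 3 matrix is the one treated in Table 1.

```python
from fractions import Fraction

def inverse(A):
    """Reduce the array [A | I] by row operations; return A^{-1} or None."""
    n = len(A)
    rows = [[Fraction(v) for v in row] +
            [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a nonzero entry in this column (operation R3 if needed).
        pr = next((i for i in range(col, n) if rows[i][col] != 0), None)
        if pr is None:
            return None                      # A cannot be changed into I
        rows[col], rows[pr] = rows[pr], rows[col]
        pivot = rows[col][col]
        rows[col] = [v / pivot for v in rows[col]]           # operation R1
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [a - factor * b                    # operation R2
                           for a, b in zip(rows[i], rows[col])]
    return [row[n:] for row in rows]

A = [[3, 2, 1],
     [2, 3, 2],
     [4, 1, 1]]
A_inv = inverse(A)
for row in A_inv:
    print([str(v) for v in row])

print(inverse([[1, 2], [-3, -6]]))   # a singular matrix
```

The individual row operations chosen by this sketch need not coincide with those recorded in a hand computation, but the final right-hand block is the same, since the inverse is unique.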


TABLE 1 Steps in Obtaining the Inverse Matrix

Row                            A                      I

R10                        3     2     1          1     0     0
R20                        2     3     2          0     1     0
R30                        4     1     1          0     0     1

R11 = (1/3)R10             1    2/3   1/3        1/3    0     0
R21 = R20 - (2/3)R10       0    5/3   4/3       -2/3    1     0
R31 = R30 - (4/3)R10       0   -5/3  -1/3       -4/3    0     1

R12 = R11 - (2/5)R21       1     0   -1/5        3/5  -2/5    0
R22 = (3/5)R21             0     1    4/5       -2/5   3/5    0
R32 = R21 + R31            0     0     1         -2     1     1

R13 = R12 + (1/5)R32       1     0     0         1/5  -1/5   1/5
R23 = R22 - (4/5)R32       0     1     0         6/5  -1/5  -4/5
R33 = R32                  0     0     1         -2     1     1

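Example 5 Table 1 shows steps in the computation of the inverse of

\[
A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 2 \\ 4 & 1 & 1 \end{pmatrix}
\]

The table begins with A and the identity matrix I. The row operations which
change A into I and I into $A^{-1}$ are indicated in the margin. The symbol
$R_{ij}$ denotes the ith row at the jth stage. Note that the strategy here, as in
solving linear equations, is to perform operations which will successively
transform the columns of A into columns of the identity matrix.
From Table 1 we read

\[
A^{-1} = \begin{pmatrix}
\tfrac{1}{5} & -\tfrac{1}{5} & \tfrac{1}{5} \\
\tfrac{6}{5} & -\tfrac{1}{5} & -\tfrac{4}{5} \\
-2 & 1 & 1
\end{pmatrix}
\]

►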








9.4 LINEAR INEQUALITIES AND CONVEX SETS

In this section we propose to study linear inequalities. That is, we shall be looking for solutions of systems such as

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &\le c_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &\le c_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &\le c_m
\end{aligned}
\tag{19}
\]

This system is identical in form to the system (1), except that inequalities have replaced equalities.
In matrix notation, (19) may be written as

\[ AX \le C \tag{20} \]

where, as usual, an inequality between matrices means that the inequality applies to each pair of corresponding elements in the two matrices. It is assumed that the numbers \(a_{ij}\) and \(c_i\) are known. Our problem is to find all solutions, that is, all vectors X such that \(AX \le C\).


Example 1 Consider the system

\[
\begin{aligned}
2x + 3y &\le 6 \\
x - y &\le 0 \\
2x - y &\ge -1
\end{aligned}
\]

The last inequality is opposite in sense to those in (19), but this is easily remedied by multiplying through by -1 to obtain the system

\[
\begin{aligned}
2x + 3y &\le 6 \\
x - y &\le 0 \\
-2x + y &\le 1
\end{aligned}
\]

Graphically, the set of points (x, y) for which 2x + 3y = 6 is the straight line shown in Figure 1. Those points which satisfy 2x + 3y < 6 lie in the shaded region below this line. Thus the points which satisfy the first inequality

\[ 2x + 3y \le 6 \]

lie on or below the line. Such a region is called a closed half plane.
Similar comments apply to the other inequalities, and we find that the set of solutions (x, y) of the inequality (20) is the intersection of three closed half planes. In Figure 2 this intersection is the shaded triangle, including the boundaries.





elimination of variables. Note that while existence of an inverse guarantees a solution, failure to have an inverse is entirely inconclusive.


6.  \( x_1 + 3x_2 = 4 \)
    \( 2x_1 - 3x_2 = 6 \)

7.  \( x - y = 4 \)
    \( 2x + 3y = 10 \)

8.  \( 3x_1 + 2x_2 - 6x_3 = 14 \)
    \( -x_1 + 4x_2 + 5x_3 = -3 \)
    \( 6x_1 - 10x_2 - 21x_3 = 10 \)

9.  \( x + 2y - 3z = 10 \)
    \( 3x - y + z = 20 \)
    \( x + y + z = 5 \)

10. \( -4x - 4y + 14z = 10 \)
    \( x + 2y + 3z = 1 \)
    \( -4x - 6y + z = 3 \)

11. \( x + 2y + 2z - w = 19 \)
    \( 2x + 4y - z + 8w = 8 \)
    \( x + 2y - 5z + 13w = -23 \)
    \( 3x + 6y + 4z + w = 45 \)


12. (a) Solve Problem 13 of Section 9.1 by first computing the inverse of the matrix I - A.

(b) Explain why the ij element of \((I - A)^{-1}\) can be interpreted as the output of industry i which is required to provide one unit of industry j product to the nonproductive sector.

13. Suppose the demand vector D = (10, 25, 20) in Problem 13, Section 9.1 can be written as the sum

\[ D = C + V + G \]

where C, V, and G denote vectors of personal consumption, private investment, and government demand, respectively, for the three sectors.

(a) Let F be the matrix (C, V, G) whose columns are the vectors C, V, and G. Explain why the (2, 3) element in \((I - A)^{-1}F\) can be interpreted as the manufacturing output which satisfies government demand.

(b) Why is the sum of the elements in the ith row of \((I - A)^{-1}F\) equal to \(x_i\), the output of the ith sector?

14. If A and B are nonsingular matrices, show that \((AB)^{-1} = B^{-1}A^{-1}\). (Hint: Compute \((AB)(B^{-1}A^{-1})\) and \((B^{-1}A^{-1})(AB)\).)

15. Prove that if \(A_1, A_2, \ldots, A_k\) are nonsingular matrices, then the product \(A_1A_2 \cdots A_k\) is nonsingular and has the inverse \(A_k^{-1} \cdots A_2^{-1}A_1^{-1}\). The inverse of a product is the product of the individual inverses in reverse order.

16. What is the effect on \(A^{-1}\) if we

(a) interchange two columns of A?

(b) multiply one row by a constant?

(c) multiply one column by a constant?



and planes in space are convex. The kidney-shaped set in Figure 4 is not convex since the point Z between X and Y is not in the set.

Our reason for introducing the concept of convexity is that systems of linear inequalities always produce convex solution sets. (The solution set in Figure 2 is obviously convex.) A proof begins with the fact that if P and Q are two points in the plane, then any point Z lying on the line which passes through P and Q can be written in the form

\[ Z = (1 - t)P + tQ \tag{21} \]

where t is a real number. If t = 0, for instance, Equation (21) becomes Z = P, while if t = 1, Z = Q (see Figure 5). Values of t greater than one give points beyond Q, while t < 0 produces a point on the far side of P. Values of t between 0 and 1 yield points between P and Q. Of particular interest is the point

\[ \frac{P + Q}{2} = \tfrac{1}{2}P + \tfrac{1}{2}Q \]

lying halfway between P and Q. This point is called the average of P and Q and is obtained by putting \(t = \tfrac{1}{2}\) in Equation (21).
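As a small worked instance of Equation (21), with points chosen here purely for illustration (they do not appear in the text), take P = (0, 0) and Q = (4, 2):

```latex
\[
Z(t) = (1 - t)(0, 0) + t(4, 2) = (4t,\; 2t)
\]
\[
Z(\tfrac{1}{2}) = (2, 1) \ \text{(the average of $P$ and $Q$)}, \qquad
Z(2) = (8, 4) \ \text{(beyond $Q$)}, \qquad
Z(-1) = (-4, -2) \ \text{(on the far side of $P$)}
\]
```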

FIGURE 4 A set which is not convex 





FIGURE 1 



The points (3/8, 7/4), (6/5, 6/5), and (-1, -1) are called corner points, or extreme points, of the triangular solution set. These corner points are intersections of the boundary lines. Thus they may be found by changing the inequalities in (20) to equalities and solving these equations in pairs ►
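For instance, the corner point (3/8, 7/4) arises from the first and third boundary lines of Example 1. The short computation below is offered only as a check on that value:

```latex
\[
\begin{aligned}
2x + 3y &= 6 \\
-2x + y &= 1
\end{aligned}
\quad\Longrightarrow\quad
4y = 7, \qquad y = \tfrac{7}{4}, \qquad x = \tfrac{1}{2}(y - 1) = \tfrac{3}{8}
\]
\[
x - y = \tfrac{3}{8} - \tfrac{14}{8} = -\tfrac{11}{8} \le 0
\]
```

The last line confirms that the point also satisfies the remaining inequality, so it is a corner point of the solution set and not merely an intersection of two boundary lines.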


A set C is said to be convex if, whenever points X and Y lie in C, all points Z between X and Y also lie in C (see Figure 3). Typical examples of convex sets in the plane are circles, triangles, rectangles, and lines. Spheres, cubes, lines,


FIGURE 2


If each employee is assumed to work 8 hours per day, there are 36 · 8 = 288 assembler man-hours and 18 · 8 = 144 packer man-hours available per day. Let a and b denote the respective daily productions of components A and B. Obviously,

\[ a \ge 0 \quad\text{and}\quad b \ge 0 \tag{23} \]

since we cannot have negative production. In addition, the capacity restrictions for assemblers and packers require that

\[ a + 2b \le 288 \quad\text{and}\quad a + \tfrac{1}{2}b \le 144 \tag{24} \]

The convex set of points (a, b) that satisfy the four inequalities (23) and (24) is shaded in Figure 6. Note that of the six points obtained by changing the four inequalities to equalities and solving in pairs, only four are actually corner points of the solution set. The question which naturally arises is this: Subject to the given restrictions, how should a and b be chosen so as to maximize profit? We shall answer this question in the next section.


FIGURE 6 




FIGURE 5 



where t is a number lying between 0 and 1.
Suppose now that X and Y are solutions of the system of linear inequalities \(AX \le C\), and let Z be any point between X and Y. That is, suppose that \(AX \le C\), \(AY \le C\), and that Z satisfies Equation (22). Then, since 1 - t and t are both non-negative,

\[ AZ = A[(1 - t)X + tY] = (1 - t)AX + tAY \le (1 - t)C + tC = C \]

and it follows that Z is also a solution. In short, any vector Z lying between two solutions X and Y is also a solution. By definition, then, the solution set is convex.


Example 2 A plant has 54 employees: 36 assemblers and 18 packers. The plant produces two component packages A and B, each of which requires the labor of both assemblers and packers. The labor requirements per unit are presented in Table 3.

TABLE 3

Labor                 Man-hours required per unit
classification        Component A      Component B

Assembly                   1                2
Packing                    1               1/2





Let x denote the number of A-type castings and y the number of B-type castings to be produced. Draw the convex set of possible (x, y) pairs.

13. Let the input matrix for a simple two-sector Leontief model be

-CD 

(a) Graph the set of viable solutions to the model.

(b) Does a non-zero interior equilibrium exist?

14. Suppose that the input matrix in Problem 13 is

c 

What is the set of viable solutions?

15. Given an input matrix

\[ A = \begin{pmatrix} 0 & a_{12} \\ a_{21} & 0 \end{pmatrix} \]

what are the restrictions on \(a_{12}\) and \(a_{21}\) in order to ensure a non-zero viable solution?

16. Show that the intersection of convex sets is a convex set.

17. Draw a picture to show that the union of two convex sets need not be a convex set.


9.5 LINEAR PROGRAMMING

A fundamental problem in economics is to allocate limited resources so as to best meet desired goals. One of the most important tools for attacking such problems is linear programming.

Example 1 Suppose in Example 2 of the previous section that components A and B yield per unit profits of $1.50 and $1.00, respectively. The plant management is faced with the problem of allocating available labor so as to achieve the objective of maximum daily profit.
In the notation of Example 2, this means that, subject to the restrictions (23) and (24), a and b are to be chosen so as to maximize the profit function

\[ P = (1.50)a + (1.00)b = \tfrac{3}{2}a + b \tag{25} \]

Let us look at the situation geometrically. If P is fixed at, say, P = 75, the set of points (a, b) which satisfy Equation (25) is a straight line (see Figure 7(a)). Choosing other values of P yields other lines, all of which are parallel (the line for P = 150, among others, is shown in Figure 7(a)). It is apparent that when P is decreased, the line representing the graph of the profit function will move toward the origin. On



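PROBLEMS

In Problems 1-6, draw the convex solution set for the given system of linear inequalities.

1.  \( x \ge 0 \)
    \( y \ge 0 \)
    \( x + 3y \le 9 \)

2.  \( x \le 1 \)
    \( y \le 2 \)
    \( 3x + 2y \ge 0 \)

3.  \( y \ge 0 \)
    \( x \ge 0 \)
    \( x \le 4 \)

4.  \( -5x + y \le 0 \)
    \( -2x + y \ge 0 \)

5.  \( x + 3y \ge 0 \)
    \( x - 3y \le 0 \)
    \( y \le 1 \)

6.  \( 2x + 4y \ge 8 \)
    \( 4x + 2y \ge 8 \)
    \( x \ge 0 \)
    \( y \ge 0 \)

In Problems 7-10, draw the solution set and find the extreme points.

7.  \( 3x + y \ge 6 \)
    \( x \ge 0 \)
    \( y \ge 0 \)

8.  \( -4x + y \le 1 \)
    \( -x + y \ge 2 \)
    \( x + y \le 4 \)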

9. 2x + y> -1 10 . ~ ix + y < 3 

;<r 54 «+;*•" 

y > 0 

11. A political candidate has decided to purchase x minutes of local radio time and y minutes of local TV time. The rates are $50 per minute for radio and $200 per minute for TV time.

(a) Negative amounts of radio and TV time cannot be bought. Express these restrictions as inequalities and draw the solution set.

(b) It is decided that not less than one minute of TV time and ½ minute of radio time will be bought. Modify the solution set of (a).

(c) The total cost for radio and TV advertising cannot exceed $1000. Modify the solution set in (b).

(d) It is decided that at least twice as much should be spent on radio advertising as on TV advertising. Modify the solution set of (c).

12. A machine shop produces two casting types, A and B. Each casting requires an operation on each of three different machines. The times required on each machine and the available free time for each machine are given in the following table.

Time required to produce 

one casting of type Available 

_ ^ B monthly time 


Machine / 
Machine II 
Machine III 


4 

n 

4 




110 

100 

150 





together with m restrictions on these variables in the form of linear inequalities

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &\le c_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &\le c_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &\le c_m
\end{aligned}
\tag{26}
\]

(In Example 1, there are two variables a and b on which restrictions are imposed by Equations (24).) In addition, there is a condition (as in Equation (23)) that each variable be non-negative:

\[ x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0 \tag{27} \]

As we saw in Section 9.4, the inequalities (26) and (27) determine a convex set F, called the set of feasible vectors. The problem is to find which, among all feasible solutions \(X = \operatorname{col}(x_1, x_2, \ldots, x_n)\), maximize a linear function

\[ g(X) = b_1x_1 + b_2x_2 + \cdots + b_nx_n \tag{28} \]

(like the profit function (25)).
The problem may be stated more succinctly using matrix notation. Given an m × n matrix A, an m × 1 vector C, and an n × 1 vector B, find those n × 1 vectors X which maximize the function

\[ g(X) = B'X \tag{28'} \]

subject to the restrictions

\[ AX \le C \tag{26'} \]

\[ X \ge O \tag{27'} \]

The key to solving this problem is the following theorem.

Theorem 4 Let F be the set of feasible vectors for the linear programming problem determined by Equations (26'), (27'), and (28'). Then, among all points in F, there is a corner point at which g is minimized and another corner point at which g is maximized.

PROOF For simplicity, we indicate the proof only for the case of two variables; the general case is similar. Let W and Y be any two feasible solutions (points in F) and let

\[ X = (1 - t)W + tY \qquad (0 < t < 1) \]





FIGURE 7 



the other hand, when profit is increased, the line will move further away. Maximum profit is obtained by moving the line as far to the right as possible while at the same time intersecting the solution set for the inequalities (23) and (24). Figure 7(b) shows that these twin objectives are realized by the line passing through the corner point (96, 96). Thus, maximum daily profits of $240 are achieved by producing 96 units each of components A and B.

The optimum solution uses all available labor resources: 96 + 2(96) = 288 assembler man-hours and 96 + ½(96) = 144 packer man-hours. The profit at each of the other corner points is smaller ►

Example 1 contains all the essential features of the general linear programming problem. In the general problem there are n variables \(x_1, x_2, \ldots, x_n\),







be a point in F which lies between W and Y (see Figure 8(a)). Suppose that a = B'W is less than b = B'Y. Then

\[ B'X = B'[(1 - t)W + tY] = (1 - t)B'W + tB'Y = (1 - t)a + tb = a + t(b - a) \]


FIGURE 8 



(a) 


(b) 


Since b - a is positive and t is between 0 and 1, it follows that

\[ a < a + t(b - a) < b \]

or that

\[ B'W < B'X < B'Y \]

The value g(X) = B'X of the function g at a point X between W and Y lies between the values of g at W and Y.

Now suppose that, among the corner points of F, the maximum value of g occurs at P and the minimum value occurs at Q (see Figure 8(b)). Let R be any point in the feasible set F. Draw a line from P through R until it cuts the boundary of F at a point S lying between corner points V and T. We know that (1) the values of g at T and V lie between the values at Q and P, by definition of Q and P; (2) the value of g at S lies between the values at T and V, and hence between the values at Q and P; and (3) the value of g at R lies between the values at P and at S, and hence between the values at Q and at P. Since this holds for any point R in F, we see that among points in the entire set F, g takes its largest value at P and its smallest value at Q ►
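Theorem 4 suggests a direct, if inefficient, way of solving a problem in two variables: list the corner points and compare the values of g there. The sketch below is illustrative only. The function names `corner_points` and `best_corner` are hypothetical, every restriction is assumed to be written in the form px + qy ≤ c, and the feasible set is assumed to be bounded. The data are the restrictions (23) and (24) and the profit function (25).

```python
from fractions import Fraction
from itertools import combinations

def corner_points(constraints):
    """Corner points of the plane region p*x + q*y <= c for every (p, q, c)."""
    corners = set()
    for (p1, q1, c1), (p2, q2, c2) in combinations(constraints, 2):
        det = p1 * q2 - p2 * q1
        if det == 0:
            continue  # parallel boundary lines have no single intersection
        # Intersection of the two boundary lines, found by solving the pair.
        x = Fraction(c1 * q2 - c2 * q1, det)
        y = Fraction(p1 * c2 - p2 * c1, det)
        # Keep the intersection only if it satisfies every inequality.
        if all(p * x + q * y <= c for p, q, c in constraints):
            corners.add((x, y))
    return corners

def best_corner(constraints, objective, maximize=True):
    """Return (value, corner) optimizing b1*x + b2*y over the corner points."""
    b1, b2 = objective
    scored = [(b1 * x + b2 * y, (x, y)) for x, y in corner_points(constraints)]
    return max(scored) if maximize else min(scored)

# Restrictions (23) and (24), each written as p*a + q*b <= c.
# a + (1/2)b <= 144 is doubled to 2a + b <= 288 to keep integer coefficients.
plant = [(-1, 0, 0), (0, -1, 0), (1, 2, 288), (2, 1, 288)]
value, corner = best_corner(plant, (Fraction(3, 2), 1))
print(float(value), [int(v) for v in corner])  # prints 240.0 [96, 96]
```

Of the six pairwise intersections of the boundary lines, only four pass the feasibility check. Comparing profits at those four points reproduces the optimum at (96, 96).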


Example 2 A college interviewer is hiring for job classifications A and B. He has been instructed by his company to hire between two and four people for job A and at least four people for job B, but no more than 10 altogether. From past experience, the interviewer knows that 80% of those hired for job A are successful and 20% unsuccessful, while for job B 90% are successful and 10% unsuccessful. A successful employee in either job is valued at $10,000, while an unsuccessful employee costs the company $4000 in job A and $9000 in job B. How many people should the interviewer hire for each job in order to maximize the total average return to the company?

To solve this problem, let x denote the number to be hired for job A and y the number for job B. We expect on the average that (0.80)x and (0.90)y will be successful while (0.20)x and (0.10)y will be unsuccessful. The total average return is

\[
\begin{aligned}
R(x, y) &= (10{,}000)(0.8x) - (4000)(0.2x) + (10{,}000)(0.9y) - (9000)(0.1y) \\
&= 7200x + 8100y
\end{aligned}
\]

The problem is to maximize R subject to the restrictions

\[ x \ge 2, \qquad x \le 4, \qquad y \ge 4, \qquad x + y \le 10 \]

The feasible region is shown in Figure 9. The respective values of R at the corner points are

\[
\begin{aligned}
R(2, 8) &= 7200(2) + 8100(8) = \$79{,}200 \\
R(4, 6) &= 7200(4) + 8100(6) = \$77{,}400 \\
R(2, 4) &= 7200(2) + 8100(4) = \$46{,}800 \\
R(4, 4) &= 7200(4) + 8100(4) = \$61{,}200
\end{aligned}
\]

The optimal solution is to hire two persons for job A and eight for job B, yielding a total average return of $79,200 ►

FIGURE 9




Example 3 In Example 2, suppose that a successful employee in job B is worth $9000. Then the total average return is

\[
\begin{aligned}
R(x, y) &= 10{,}000(0.8x) - 4000(0.2x) + 9000(0.9y) - 9000(0.1y) \\
&= 7200x + 7200y = 7200(x + y)
\end{aligned}
\]

The graphic solution with the same restrictions on x and y is shown in Figure 10. Instead of a solution occurring at a single corner point as in Example 2, corner points (2, 8) and (4, 6) both produce the maximum average return

\[ R = 7200(10) = \$72{,}000 \]

It follows that every boundary point of F lying on the line between (2, 8) and (4, 6) also yields maximum return ►
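As a quick check of this last remark, consider the point (3, 7), which lies on the boundary segment between the two corner points. The point is chosen here for illustration and is not taken from the text:

```latex
\[
2 \le 3 \le 4, \qquad 7 \ge 4, \qquad 3 + 7 = 10
\]
\[
R(3, 7) = 7200(3) + 7200(7) = 7200(10) = \$72{,}000
\]
```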

FIGURE 10 


y 



Example 4 Production of a unit output in a certain job requires the completion of three separate tasks A, B, and C. Three workers are available for assignment to the job. The efficiency of each worker in terms of the number of units produced per hour for each task is indicated in Table 4. Given that worker I costs $2.50 per hour, worker II $3.00 per hour, and worker III $5.00 per hour, what is the best utilization of each worker in order to produce at least 1000 units with minimum labor cost?

TABLE 4. Number of units produced per hour for each worker

               I       II      III

Task A        40      120       60
Task B        30       50      150
Task C        60       80      100

We solve this problem by letting x, y, and z represent the number of hours production time allocated to workers I, II, and III, respectively. The linear programming problem is to minimize the cost function

\[ C = (\$2.50)x + (\$3.00)y + (\$5.00)z \]

subject to the linear constraints (the rates of Table 4 and the requirement of 1000 units, each divided by 100)

\[
\begin{aligned}
0.4x + 1.2y + 0.6z &\ge 10 & x &\ge 0 \\
0.3x + 0.5y + 1.5z &\ge 10 & y &\ge 0 \\
0.6x + 0.8y + 1.0z &\ge 10 & z &\ge 0
\end{aligned}
\]

The locus of the solution in three-dimensional space is presented in Figure 11. In the case of linear inequalities in three unknowns, each linear equality appears as a plane in space. The lines of intersection of the planes in the positive octant are

first plane A with second plane B: AB, BA
first plane A with third plane C: AC, CA
second plane B with third plane C: BC, CB

where the endpoints of each line are the intersections of the traces of the respective planes. The three lines AB, BA; AC, CA; and BC, CB intersect at the point S = (2.10, 5.40, 4.46), which satisfies the set of three equalities. The boundary planes of the unbounded convex set of feasible solutions defined by the three inequality constraints are outlined by bold lines in Figure 11.

FIGURE 11

The extreme points of the convex set are

E_1 = (0, 20, 0)            E_5 = (0, 5.56, 5.55)
E_2 = (33.33, 0, 0)         E_6 = (0, 7.14, 4.29)
E_3 = (21.43, 0, 2.38)      E_7 = (2.10, 5.40, 4.46)
E_4 = (0, 0, 16.67)

Computing the values of the cost function C = ($2.50)x + ($3.00)y + ($5.00)z at each extreme point, we obtain the following costs: $60, $83.32, $65.48, $83.35, $44.43, $42.87, and $43.75, respectively. Thus, the minimum labor cost is achieved by not employing worker I, by employing worker II for 7.14 hours, and worker III for 4.29 hours. Allocation of labor according to this schedule will result in completion of approximately 1114 task A units, 1000 task B units, and 1000 task C units. In other words, in order to achieve minimum labor cost for the completion of at least 1000 items, it is necessary to complete 114 more task A units than absolutely required ►
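The arithmetic of Example 4 is easy to recheck by machine. The sketch below simply re-evaluates the cost function and the task outputs at the seven extreme points as printed above, rounded to two decimals. It does not derive those points, and the variable names are illustrative.

```python
# Hourly output of workers I, II, III on each task (Table 4).
units_per_hour = {"A": (40, 120, 60), "B": (30, 50, 150), "C": (60, 80, 100)}
hourly_cost = (2.50, 3.00, 5.00)  # dollars per hour for workers I, II, III

# Extreme points (x, y, z) as listed in the text, rounded to two decimals.
extreme_points = {
    "E1": (0, 20, 0), "E2": (33.33, 0, 0), "E3": (21.43, 0, 2.38),
    "E4": (0, 0, 16.67), "E5": (0, 5.56, 5.55), "E6": (0, 7.14, 4.29),
    "E7": (2.10, 5.40, 4.46),
}

def labor_cost(hours):
    """Total cost of a schedule giving `hours` to workers I, II, III."""
    return sum(rate * h for rate, h in zip(hourly_cost, hours))

costs = {name: labor_cost(p) for name, p in extreme_points.items()}
cheapest = min(costs, key=costs.get)

# Units of each task completed under the cheapest schedule.
completed = {task: sum(r * h for r, h in zip(rates, extreme_points[cheapest]))
             for task, rates in units_per_hour.items()}

for name in extreme_points:
    print(name, round(costs[name], 2))
print(cheapest, {task: round(u) for task, u in completed.items()})
```

Because the coordinates are rounded, the computed costs agree with the quoted figures only to within about a cent.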

PROBLEMS

3. Find both the maximum and the minimum values which the stated linear function takes in the respective convex solution sets described in the given problems.

(a) G(x, y) = 2x + 6y; Problems 1, 2, and 5 of Section 9.4

(b) H(x, y) = -5x + 2y; Problems 3 and 4 of Section 9.4

(c) H(x, y) = 4x + 8y - 12; Problems 8, 9, and 10 of Section 9.4

4. In Problem 11 of Section 9.4, assume that the value criterion is

V = $1000x + $1000y

where x is TV time and y is radio time. How much time should the candidate purchase in order to maximize his value?

5. Suppose that the value criterion in Problem 4 is V = $500x + $2500y. How will this affect the amount of TV and radio time purchased?



6. In Problem 12, Section 9.4, assume that the manufacturer makes a profit of $8 per unit for type-A castings and $12 per unit for type-B castings.

(a) How many of each type should be produced in order to maximize his profit?

(b) Which inequality is superfluous in that the feasible set is the same with and without the inequality restriction?

7. Suppose that the profit on type-A castings in Problem 6 is increased to $12 per unit. How many units of each casting should now be produced in order to maximize profits?

8. An oil refinery produces* 2000 gallons of pitch per day. A certain amount, P gallons, of this is to be blended with flux stock, of which there is an unlimited supply available, to make commercial fuel oil. The rest, 2000 - P gallons, is sent to a visbreaker unit which converts each gallon of pitch into 0.8 gallon of tar (see Figure 12). The tar may also be blended with flux stock to make fuel oil.

Blending requirements impose the constraints

\[ 5P + 11T + 37F \ge 21(P + T + F) \]

\[ 8P + 7T + 24F \ge 12(P + T + F) \]

where T is the amount of tar produced and F is the amount of flux stock used. Of course, T = 0.8(2000 - P). Assuming that fuel oil brings a net return of 8¢ per gallon, and flux stock has a cracking value of 10¢ per gallon, find the blend of flux stock, pitch, and tar which will maximize the profit function

\[ R = 8(P + T + F) - 10F \]

FIGURE 12 (flow diagram: daily production of pitch; pitch and tar blended with flux stock; fuel oil produced)

*Adapted from Symonds, G. H., Linear Programming: The Solution of Refinery Problems (Esso Standard Oil Company, New York, 1955).

The following problems involve three variables. In each case, draw a three-dimensional picture showing the feasible region and the corner points.

9. A cotton farmer has been persuaded by his county agent to consider diversification as a means of increasing income. The local banker is willing to lend the farmer up to $6000 for the construction of facilities to handle livestock. The farmer has 80 acres of land and 3000 man-hours available per year to devote to farming.
The resource requirements for the various alternatives are as follows:

                        Cotton       Dairy       Chickens
                       (1 acre)     (1 cow)     (100 hens)

Labor (man-hours)        100          100          125
Land (acres)               1            2            0
Construction ($)           0          300          400

If the net unit returns are $400 per cow, $250 per 100 hens, and $200 per acre of cotton, how should the farmer allocate his resources in order to maximize his net return?

10. Rework Problem 9 with the net returns equal to $300 per cow, $400 per 100 hens, and $150 per acre of cotton.

11. A company has on hand 500 lb of peanuts, 150 lb of cashew nuts, and 75 lb of Brazil nuts. It packages and sells in eight-ounce cans three basic mixes of nuts. Type A, consisting of 50% peanuts, 40% cashews, and 10% Brazil nuts, sells for 39¢ per can; type B consists of cashew nuts only and sells for 80¢ per can; and type C consists of 40% cashews and 60% Brazil nuts and sells for 65¢ per can. What production schedule will provide maximum returns to the company?







CHARACTERISTIC EQUATIONS- 
QUADRATIC FORMS 


10.1 DETERMINANTS

If A is a square matrix, the determinant of A, written det A, is a number computed by combining the elements of A in a certain way. (Another common notation for the determinant of A is |A|.) For a 2 × 2 matrix the determinant is defined by

\[ \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \tag{1} \]

For instance,

\[ \det\begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} = (1)(3) - (2)(-1) = 3 + 2 = 5 \]

The definition (1) is designed to mirror certain computations which arise in the solution of linear equations. Consider the pair of equations

\[
\begin{aligned}
ax + by &= k \\
cx + dy &= m
\end{aligned}
\]



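If we multiply both sides of the first equation by c and both sides of the second by a, we obtain

\[
\begin{aligned}
(ac)x + (bc)y &= kc \\
(ac)x + (ad)y &= am
\end{aligned}
\]

Subtracting the first of these equations from the second in order to eliminate x, and solving the resulting equation for y (assuming ad - bc is not zero) gives

\[ y = \frac{am - kc}{ad - bc} \tag{2} \]

Similar computations yield

\[ x = \frac{kd - bm}{ad - bc} \tag{3} \]

A glance at the definition (1) shows that these solutions for x and y may be written as ratios of determinants

\[
x = \frac{\det\begin{pmatrix} k & b \\ m & d \end{pmatrix}}{\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}},
\qquad
y = \frac{\det\begin{pmatrix} a & k \\ c & m \end{pmatrix}}{\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}}
\tag{4}
\]

Equations (4) are called Cramer's Rule* for solving linear equations.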
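Equations (4) translate directly into a short routine. The sketch below is illustrative: the function names `det2` and `cramer_2x2` are hypothetical, integer coefficients are assumed, and the sample system is not taken from the text.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d), as in (1)."""
    return a * d - b * c

def cramer_2x2(a, b, k, c, d, m):
    """Solve ax + by = k, cx + dy = m by the ratios of determinants in (4)."""
    denom = det2(a, b, c, d)
    if denom == 0:
        raise ValueError("ad - bc = 0: Cramer's Rule does not apply")
    x = Fraction(det2(k, b, m, d), denom)  # k, m replace the first column
    y = Fraction(det2(a, k, c, m), denom)  # k, m replace the second column
    return x, y

# Sample system: 2x + y = 5, x + 3y = 10.
x, y = cramer_2x2(2, 1, 5, 1, 3, 10)
print(x, y)  # prints 1 3
```

When ad - bc = 0 the ratios are undefined, which is why the routine refuses to proceed in that case.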

We now wish to extend the definition of determinant from 2 × 2 matrices to all square matrices (determinant is a function whose domain is the set of all square matrices). Rather than continuing to work with individual elements as in (1), we adopt the strategy of describing the effect which elementary row operations on matrices have on the corresponding determinants. The properties of the determinant function det are given below.

Property 1 If B is the matrix obtained from matrix A by multiplying one row of A by the constant m, then

\[ \det B = m \det A \]

Multiplying a row by a constant multiplies the determinant by that same constant.

Property 2 If B is obtained from A by adding a multiple of the ith row of A to the jth row, then

\[ \det B = \det A \]

Adding a multiple of one row to another row does not alter the value of the determinant.

*After the Swiss mathematician Gabriel Cramer (1704-1752).



Property 3 If B is obtained from A by interchanging two rows, then

\[ \det B = -\det A \]

Interchanging rows changes the sign of the determinant.

Property 4 The determinant of an identity matrix is 1:

\[ \det I = 1 \]

Computation of determinants is accomplished by performing row operations on the matrices in question.
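The way Properties 1-4 combine into a computation can be sketched as follows. This is an illustration under stated assumptions, not the book's own procedure: the name `det_by_row_operations` is hypothetical, exact fractions are used, and the running variable `factor` records how each row operation has changed the determinant.

```python
from fractions import Fraction

def det_by_row_operations(a):
    """Determinant of a square matrix, computed from Properties 1-4."""
    n = len(a)
    m = [[Fraction(v) for v in row] for row in a]
    factor = Fraction(1)  # invariant: det(a) == factor * det(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # m cannot be reduced to I, so the determinant is 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            factor = -factor  # Property 3: an interchange changes the sign
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        factor *= p  # Property 1: dividing a row by p divides det(m) by p
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                # Property 2: adding a multiple of a row leaves det(m) unchanged.
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return factor  # m is now I and det I = 1 (Property 4)

print(det_by_row_operations([[1, 2], [-1, 3]]))  # prints 5
```

For a 2 × 2 matrix the result agrees with definition (1), as the printed value for the matrix with rows (1, 2) and (-1, 3) shows.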


Example 1 If A is the matrix 


then 

det .4 = 


= —det 


= —2 det 




— 2 det 


= -2 det / = 


[Property 3] 


[Property 2 
applied twice] 


[Property 1 
then 

Property 2] 


[Property 4] ► 


Theorem 1 If one row of a square matrix A consists entirely of zeros, then det A = 0.

PROOF Suppose the ith row of A contains only zeros and let B be the matrix obtained from A by multiplying this ith row by the constant 0. Of course, B = A so det B = det A. On the other hand, Property 1 says that

\[ \det B = 0 \cdot \det A = 0 \]

It follows that det A = 0 ►



Example 2 


det 



— det 


0 

1 

2 


[Property 3] 


/' ° 

= —det I 0 1 

\0 2 


[Property 2] 


(' 0 

= -det I 0 1 

\0 0 


[Property 2] 


[Theorem 1] 


► 


Since a square matrix can always be reduced by row operations either to an identity matrix or to a matrix containing a zero row, it is clear that Properties 1-4 do indeed determine a function which assigns to each square matrix a unique number. This number is nonzero if and only if the matrix can be reduced to an identity matrix. Coupling this with Theorem 3 in Section 9.3, we obtain the following important result.

Theorem 2 A square matrix is nonsingular (that is, has an inverse) if and only if its determinant is nonzero. Equivalently, a square matrix is singular if and only if its determinant is zero ►


Example 3 (a) The matrix A in Example 1 is nonsingular and has the inverse

""Gr:) 

(b) The matrix in Example 2 has no inverse since its determinant is zero ►


A system of linear equations

\[ AX = O \tag{5} \]

always has at least one solution, namely X = O. If A is square and nonsingular, then this is the only solution, as may be seen by multiplying both sides of (5) by \(A^{-1}\).



On the other hand, if A is singular, then row operations performed on A to solve the equations will produce a zero row. This row may then be eliminated (as in Example 2 of Section 9.1), leaving a set of equations in which one unknown, say \(x_n\), may be chosen arbitrarily and the others written as functions of \(x_n\). In this case there are nonzero solutions to Equation (5). Together with Theorem 2 these considerations yield the following theorem.

Theorem 3 If A is a square matrix, the set of linear equations

\[ AX = O \]

has a nonzero solution if and only if the determinant of A is zero ►
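A small instance, chosen here for illustration and not taken from the text, shows both halves of Theorem 3 at work:

```latex
\[
A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \qquad
\det A = (1)(4) - (2)(2) = 0, \qquad
A\begin{pmatrix} 2 \\ -1 \end{pmatrix}
= \begin{pmatrix} 2 - 2 \\ 4 - 4 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
\[
B = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}, \qquad
\det B = 5 \ne 0, \qquad
BX = O \ \text{only for } X = O
\]
```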


Example 4 In Section 8.4 we saw that a vector X was an interior equilibrium for a closed Leontief system with input matrix A if

\[ AX = X \]

Rewriting this equation as

\[ X - AX = O \quad\text{or}\quad (I - A)X = O \]

we see that an interior equilibrium exists if and only if det(I - A) = 0, that is, if and only if I - A is singular ►

We shall state and use without proof the following result.

Theorem 4 All statements concerning rows in Properties 1-4 of the determinant function and Theorem 1 hold for columns as well. In addition, if A is any square matrix, then

\[ \det A = \det A' \]

►




7. Use Cramer's Rule to solve the simple national income model

\[
\begin{aligned}
Y &= C + I_0 + G_0 \\
C &= a + bY
\end{aligned}
\]

where \(I_0\), \(G_0\), a, and b are constants.

8. Let \(u_i\) (i = 1, 2, 3) denote the utility of outcome i. Solve for \(u_1\), \(u_2\), and \(u_3\) in the linear model



9. Determine two different solutions for the linear equations

\[
\begin{aligned}
2x + 4y - 6z &= 0 \\
3x - 3y + 5z &= 0 \\
4x - y + 2z &= 0
\end{aligned}
\]

10. Use the results of Example 4 to rework Problem 12 of Section 9.1.

11. Use elementary row and/or column operations to establish the following


W 


(b) 


(0 


J + I r + t r + s , 


~y *-y *2 

1 1 


>'■ 
x + y 


^1 + z i 

21 + *l\ 

/XI 

*2 

X3\ 

i2 + Z 2 

22 + *2 

= 2 det ( yi 

>■2 

^3 ] 

^3 + Z 3 

2-5 + * 3 / 

V. 

22 

23/ 


12. Show that if two rows (columns) of matrix A are proportional, then det A = 0.

13. (a) Show that for the 2 × 2 matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

Properties 1-4 of the determinant function give det A = ad - bc, in agreement with Equation (1).

(b) Prove Theorem 4 for 2 × 2 matrices.



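10.2 CHARACTERISTIC ROOTS AND CHARACTERISTIC VECTORS

The operation of multiplying a vector (row or column) by a matrix to produce a new vector has a variety of interpretations important for both theory and applications.

Example 1 The point \(X = \operatorname{col}(x_1, x_2)\) lies at a distance \(r = \sqrt{x_1^2 + x_2^2}\) from the origin. Let \(\alpha\) be the angle between the horizontal axis and the line from the origin to X (see Figure 1). From elementary trigonometry we know that

\[ \cos\alpha = \frac{x_1}{r} \quad\text{and}\quad \sin\alpha = \frac{x_2}{r} \]

FIGURE 1

Suppose the vector X is rotated about the origin through an angle \(\theta\) to a new position \(Y = \operatorname{col}(y_1, y_2)\), so that Y makes an angle \(\beta = \alpha + \theta\) with the horizontal axis. Using trigonometric formulas we find

\[
\frac{y_1}{r} = \cos\beta = \cos(\alpha + \theta) = \cos\alpha\cos\theta - \sin\alpha\sin\theta
= \frac{x_1}{r}\cos\theta - \frac{x_2}{r}\sin\theta
\]

and

\[
\frac{y_2}{r} = \sin\beta = \sin(\alpha + \theta) = \cos\alpha\sin\theta + \sin\alpha\cos\theta
= \frac{x_1}{r}\sin\theta + \frac{x_2}{r}\cos\theta
\]

The two equations

\[
\begin{aligned}
y_1 &= x_1\cos\theta - x_2\sin\theta \\
y_2 &= x_1\sin\theta + x_2\cos\theta
\end{aligned}
\]

may be written as the single matrix equation

\[
Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = AX
\tag{6}
\]

►

In Example 1, the matrix multiplication (6) is interpreted as producing a rotation of the vector X through an angle \(\theta\). Geometrically, the same result is obtained if the axes are rotated through an angle \(-\theta\). This point of view is illustrated in the next example.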
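The rotation in Equation (6) is easily tried numerically. The sketch below is illustrative: the function name `rotate` and the choice of degrees as the unit of angle are assumptions made here, not part of the text.

```python
import math

def rotate(x, theta_degrees):
    """Apply the matrix of Equation (6) to the vector x = (x1, x2)."""
    theta = math.radians(theta_degrees)
    x1, x2 = x
    y1 = x1 * math.cos(theta) - x2 * math.sin(theta)
    y2 = x1 * math.sin(theta) + x2 * math.cos(theta)
    return (y1, y2)

y = rotate((3, 4), 30)
# A rotation changes direction but not the distance r from the origin.
print(y, math.hypot(*y))
```

Rotating (1, 0) through 90 degrees gives (0, 1) up to rounding error, and in every case the length of the vector is unchanged.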


Example 2 In factor analysis (see Section 8.2), the elements in the jth row of the factor matrix F indicate the "loadings" of the jth test on the various factors (that is, the degrees to which the jth test measures the various factors). For instance, in the matrix

\[
F = \begin{pmatrix} 0.1 & 0.2 \\ 0.3 & 0.3 \\ 0.5 & 0.4 \\ 0.6 & 0.7 \\ 0.8 & 0.6 \end{pmatrix}
\tag{7}
\]

each of the five tests has approximately equal loadings on each of the two factors.
Think of the rows of the factor matrix (7) as points in the plane. These points are plotted in Figure 2. Multiplication of F by the transpose of

\[
R = \begin{pmatrix} \cos(-45^\circ) & -\sin(-45^\circ) \\ \sin(-45^\circ) & \cos(-45^\circ) \end{pmatrix}
= \begin{pmatrix} 0.7071 & 0.7071 \\ -0.7071 & 0.7071 \end{pmatrix}
\]

rotates the axes 45° and produces the new factor matrix

\[
FR' = \begin{pmatrix} 0.2121 & 0.0707 \\ 0.4243 & 0 \\ 0.6364 & -0.0707 \\ 0.9192 & 0.0707 \\ 0.9899 & -0.1414 \end{pmatrix}
\]

FIGURE 2

The two new factors are combinations of the original ones, chosen so as to have relatively large loadings on the first of these factors and small loadings on the second. Each test may thus be regarded as essentially measuring a single common factor, as opposed to the more complex bifactor interpretation suggested by the original factor matrix ►


Of particular interest relative to a square matrix A are those nonzero vectors
X for which there is a constant $\lambda$ such that

\[
AX = \lambda X \tag{8}
\]

The number $\lambda$ in (8) is called a characteristic root of A, and X is
called a characteristic vector of A. (Characteristic roots are also called
"characteristic values," "eigenvalues," or "latent roots"; characteristic
vectors are also called "eigenvectors," "latent vectors," or "invariant
vectors.") Equation (8) says that multiplication of X by A produces a new vector
which is a multiple of X. Geometrically, multiplication of X by A produces no
rotation at all but leaves X on the same straight line, perhaps stretching it
($|\lambda| > 1$) or shrinking it ($|\lambda| < 1$), and perhaps reversing its
direction ($\lambda < 0$). (See Example 1 of Section 8.2.)

Example 3 Relative to Equation (6), unless $\theta$ is zero or a multiple of
$\pi$, no vector other than (0, 0) can be transformed into a multiple of
itself. For such $\theta$ the rotation matrix A therefore has no real
characteristic vectors. ►

Example 4 Recall (Section 8.4) that a closed Leontief system with input
matrix A has an interior equilibrium X if and only if X > 0 and

\[
AX = X \tag{9}
\]

This corresponds to Equation (8) with $\lambda = 1$. In economic terms, (9)
represents the situation where production equals input demands for each
commodity. ►


The problem of finding characteristic roots and the corresponding
characteristic vectors is a problem of solving Equation (8) for $\lambda$ and
for X, subject to the restriction that X be nonzero. If we rewrite (8) as

\[
(A - \lambda I)X = O
\]

and apply Theorem 3, we see that $\lambda$ must be chosen so that

\[
\det(A - \lambda I) = 0 \tag{10}
\]

Equation (10) is called the characteristic equation for the matrix A and the
determinant $\det(A - \lambda I)$ is called the characteristic polynomial for A. If
A is an n × n matrix, this is an nth degree polynomial in $\lambda$. It follows that
an n × n matrix has n (not necessarily different) characteristic roots.


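Example 5 The characteristic equation $\det(A - \lambda I) = 0$ for the matrix

\[
A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
\]

is

\[
\det\begin{pmatrix} 1 - \lambda & 2 \\ 2 & 1 - \lambda \end{pmatrix} = \lambda^2 - 2\lambda - 3 = 0 \tag{11}
\]

The solutions of (11) provide the two characteristic roots
$\lambda_1 = -1$ and $\lambda_2 = 3$.

Solving the equation

\[
(A - \lambda_1 I)X = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]

shows that the characteristic vectors of A which correspond to $\lambda_1$ are
the multiples of col (1, −1), that is, X = col (c, −c), where c is some
constant. Similarly, characteristic vectors of A corresponding to
$\lambda_2 = 3$ have the form Y = col (c, c), where c is some constant. These
are found by solving

\[
(A - 3I)Y = O
\]
►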
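For a 2 × 2 matrix the characteristic equation is a quadratic, so the roots can be computed directly. The sketch below assumes real roots and uses function names invented for this illustration; it checks the roots and vectors quoted in Example 5.

```python
import math

def characteristic_roots_2x2(A):
    """Real roots of det(A - lambda*I) = lambda^2 - (trace)*lambda + det = 0."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("roots are complex")
    r = math.sqrt(disc)
    return ((trace - r) / 2, (trace + r) / 2)

def is_characteristic_vector(A, x, lam, tol=1e-12):
    """Check AX = lambda X for a nonzero 2-vector X."""
    (a, b), (c, d) = A
    return (abs(a * x[0] + b * x[1] - lam * x[0]) < tol and
            abs(c * x[0] + d * x[1] - lam * x[1]) < tol)

A = ((1, 2), (2, 1))
roots = characteristic_roots_2x2(A)
print(roots)                                      # (-1.0, 3.0)
print(is_characteristic_vector(A, (1, -1), -1))   # True
print(is_characteristic_vector(A, (1, 1), 3))     # True
```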

PROBLEMS

1. Verify the computations in Example 5.

A matrix formed from a matrix C by eliminating various rows and columns of C
is called a minor matrix of C. The determinant of a square minor matrix is
called a minor determinant of C. If C is square, the determinant of a square
minor matrix formed by choosing elements in the same numbered rows and columns
is a principal minor determinant of C. The following theorem, which we do not
prove, tells how to write the characteristic equation of a matrix in terms of
its principal minor determinants.

Theorem 5 Let $p_k$ denote the sum of all k × k principal minor determinants
of an n × n matrix A. Then

\[
\det(A - \lambda I) = (-\lambda)^n + p_1(-\lambda)^{n-1} + p_2(-\lambda)^{n-2} + \cdots + p_{n-1}(-\lambda) + p_n
\]
►

Example 6 For the matrix

\[
A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
\]

in Example 5, the 1 × 1 principal minor determinants are formed by choosing
each of the elements on the main diagonal in turn. There are two such
determinants, each with value 1. The only 2 × 2 principal minor determinant
is det A. According to Theorem 5, the characteristic polynomial of A is

\[
\det(A - \lambda I) = (-\lambda)^2 + (1 + 1)(-\lambda) + \det A = \lambda^2 - 2\lambda - 3
\]

as before. ►
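Theorem 5 can be turned into a small computation. The sketch below is only an illustration for small matrices: it uses cofactor expansion for determinants, and the function names are assumptions of this example rather than anything defined in the text.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minor_sums(A):
    """Return [p_1, ..., p_n], where p_k is the sum of all k x k principal minor determinants."""
    n = len(A)
    sums = []
    for k in range(1, n + 1):
        p_k = 0
        for idx in combinations(range(n), k):
            # Same-numbered rows and columns give a principal minor.
            sub = [[A[i][j] for j in idx] for i in idx]
            p_k += det(sub)
        sums.append(p_k)
    return sums

def characteristic_polynomial(A, lam):
    """Evaluate (-lam)^n + p_1(-lam)^(n-1) + ... + p_n, as in Theorem 5."""
    n = len(A)
    p = principal_minor_sums(A)
    return (-lam) ** n + sum(p[k - 1] * (-lam) ** (n - k) for k in range(1, n + 1))

A = [[1, 2], [2, 1]]
print(principal_minor_sums(A))           # [2, -3]
print(characteristic_polynomial(A, 3))   # 0
print(characteristic_polynomial(A, -1))  # 0
```

The polynomial vanishes at 3 and at −1, the two characteristic roots found in Example 5.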

Example 7 The matrix . 

has first-order principal minor determinants

det (3) = 3, det (2) = 2, and det (5) = 5,

second-order principal minors 


ma-oraer principal minuia 

:)=» - -c 


10 


\-l 2/ V v ' 

and the third-order principal minor det M = 0. The characteristic polynomial is

\[
\det(M - \lambda I) = (-\lambda)^3 + (3 + 2 + 5)(-\lambda)^2 + (0 + 5 + 10)(-\lambda) + 0
\]

The characteristic roots, found by solving $\det(M - \lambda I) = 0$, are
$\lambda_1 = 0$, $\lambda_2 = 5 + \sqrt{10}$, and $\lambda_3 = 5 - \sqrt{10}$. ►

In Example 4 we saw that in order for a closed Leontief system to have
an interior equilibrium, the number 1 must be a characteristic root of the
corresponding input matrix A and there must be a non-negative characteristic
vector corresponding to this root. This is not always the case. A general
result which is useful in this context is the following theorem, which we do
not prove.

Theorem 6 Suppose A is a square matrix with non-negative entries. Then

(a) there is a unique positive characteristic root r which is larger in
absolute value than any other root;

(b) the root r in (a) lies between the largest and smallest of the row (or
column) sums of A;

(c) there are positive characteristic vectors corresponding to r. ►

From this theorem, it follows that if the row (or column) sums of A are
all unity, a closed Leontief system does indeed have an interior equilibrium.
For instance, the input matrix

\[
\begin{pmatrix} 0.3 & 0.2 & 0.1 \\ 0.6 & 0.3 & 0.1 \\ 0.1 & 0.5 & 0.8 \end{pmatrix}
\]

has unit column sums and characteristic roots

\[
1, \quad 0.2 + \sqrt{0.05} \approx 0.4236, \quad \text{and} \quad 0.2 - \sqrt{0.05} \approx -0.0236
\]

Any characteristic vector X associated with the root 1 has the form

\[
X = \operatorname{col}\left(\tfrac{9}{13}c,\; c,\; \tfrac{37}{13}c\right)
\]

which is strictly positive whenever c > 0.
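The equilibrium condition AX = X for this input matrix can be verified exactly with rational arithmetic. This is a minimal sketch; the entries are those of the matrix as read from the text, and the proportions 9 : 13 : 37 for the characteristic vector are obtained by solving (A − I)X = O by hand, so treat them as an assumption to be confirmed by the check itself.

```python
from fractions import Fraction as Fr

# Input matrix from the text, written with exact fractions (tenths).
A = [[Fr(3, 10), Fr(2, 10), Fr(1, 10)],
     [Fr(6, 10), Fr(3, 10), Fr(1, 10)],
     [Fr(1, 10), Fr(5, 10), Fr(8, 10)]]

def mat_vec(A, x):
    """Return the product AX as a list."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

column_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
print(column_sums)          # each column sums to 1

c = Fr(13)                  # any positive c works; 13 clears the denominators
X = [Fr(9, 13) * c, c, Fr(37, 13) * c]
print(mat_vec(A, X) == X)   # True: AX = X, so X is an equilibrium vector
```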


PROBLEMS

In each of Problems 2-9 determine the characteristic roots and associated
characteristic vectors of the given matrix.



.6-3 6 3/ 



(■; ;) ' (,; ;) 





10. Find the characteristic roots of 


(a) 




11. What are the characteristic roots of $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$?


12. Find three different 2X2 matrices whose characteristic roots are 1 and 4 


13. Calculate the characteristic roots for the matrix 



Under what conditions are the roots

(a) real and unequal? (b) real and equal? (c) complex?

14. In Problem 2, the characteristic equation was found to be
$\lambda^2 - \lambda - 6 = 0$. Verify that

C: ;r: M: K :) 

In a sense, this matrix satisfies its own characteristic equation.

15. In Example 2, rotate the points by 60° and compute the new F* matrix.

16. The rank of a matrix A is the order of the largest order nonzero minor
determinant of A. The rank of a zero matrix is zero. For instance, the
matrix M in Example 7 has rank 2 since there are 2 × 2 minor determinants
which are not zero, but the only 3 × 3 minor determinant, det M,
is zero.


(a) Determine the ranks of the matrices in Problems 2, 6, 7 and 9 

(b) Determine the ranks of 

j\ 0 0 0 \ 

and 


/0 0 0 \ 
\0 0 o) 


0 0 0 
.0 10 0 , 


17. Argue that performing elementary operations on a matrix cannot
change the rank of that matrix.

18. Suppose that a system of four linear equations in seven variables has the
coefficient matrix partitioned as shown (the $a_{ij}$ are all nonzero).


/ill 

o 12 

013 

Oi4 

{ 0 

0 

-A 

A =| 021 

0 

023 

0 

| 

} 025 

l 

020 

•V 

\ ° 

032 

033 

0 

{ 035 

036 

0 / 

Vu 

042 

0 

044 

t 

\ 045 

0 

047/ 


m 


Use the result of Problem 17 to argue that the first equation is identifiable
(see Section 9.2) if and only if the 3 × 3 matrix D has rank 3.



10.3 SYMMETRIC MATRICES AND
QUADRATIC FORMS


A matrix $A = (a_{ij})$ is symmetric if it is equal to its own transpose, that is, if

\[
A' = A \tag{12}
\]


For example, the matrices 




\[
\begin{pmatrix}
1 & 0.3 & 0.2 & 0 \\
0.3 & 1 & 0.6 & -0.1 \\
0.2 & 0.6 & 1 & 0.2 \\
0 & -0.1 & 0.2 & 1
\end{pmatrix}
\]


are symmetric 

In terms of rows and columns, Equation (12) says that the first row and
first column of a symmetric matrix must contain the same entries in the same
order; the same is true of the second row and second column, etc. In terms
of elements, (12) implies that the element $a'_{ij} = a_{ji}$ in the ij position of
A' must equal the corresponding element $a_{ij}$ of A. Thus if A is symmetric,
we have

\[
a_{ji} = a_{ij}
\]

for all i and j. The elements which are symmetric relative to the main diagonal
must be equal, as in the above examples.


Example 1 In Example 6, Section 8.2 we defined a correlation matrix
$R = (r_{jk})$ in which an entry $r_{jk}$ denoted the correlation between tests j and
k. Since by definition

\[
r_{jk} = \frac{1}{n}\sum_{i} s_{ji}s_{ki} = \frac{1}{n}\sum_{i} s_{ki}s_{ji} = r_{kj}
\]

the matrix R is symmetric. ►


Example 2 Define a relation R on a set of research projects by iRj if and
only if project i is related to project j. Since R is a symmetric relation (see
Section 3.2), its adjacency matrix $C = (c_{ij})$ defined by

\[
c_{ij} = \begin{cases} 1 & \text{if } iRj \\ 0 & \text{otherwise} \end{cases}
\]

must be a symmetric matrix. ►


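If $A = (a_{ij})$ is an n × n symmetric matrix and
$X = \operatorname{col}(x_1, x_2, \ldots, x_n)$ is an n-dimensional vector, the expression

\[
Q(X) = X'AX \tag{13}
\]

is called a quadratic form in the variables $x_1, x_2, \ldots, x_n$. Performing the
multiplication in (13) we find

\[
\begin{aligned}
X'AX &= (x_1, x_2, \ldots, x_n)
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \\
&= (x_1, x_2, \ldots, x_n)
\begin{pmatrix}
\sum_{j=1}^{n} a_{1j}x_j \\
\sum_{j=1}^{n} a_{2j}x_j \\
\vdots \\
\sum_{j=1}^{n} a_{nj}x_j
\end{pmatrix} \\
&= \sum_{i=1}^{n} x_i\left(\sum_{j=1}^{n} a_{ij}x_j\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j
\end{aligned}
\]

For each ij pair, the coefficient of $x_ix_j$ is the element in the ij position of A.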


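Example 3 If

\[
A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}
\]

then the quadratic form determined by A is

\[
\begin{aligned}
Q(X) = X'AX &= (x_1, x_2)\begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} && (14) \\
&= (x_1, x_2)\begin{pmatrix} x_1 + 2x_2 \\ 2x_1 + 3x_2 \end{pmatrix} \\
&= x_1(x_1 + 2x_2) + x_2(2x_1 + 3x_2) \\
&= x_1^2 + 2x_1x_2 + 2x_2x_1 + 3x_2^2 && (15) \\
&= x_1^2 + 4x_1x_2 + 3x_2^2 && (16)
\end{aligned}
\]

Note carefully how the elements of the matrix A enter the sum (15). ►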
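The double sum defining a quadratic form is easy to evaluate directly. In this sketch the function name `quadratic_form` and the non-symmetric matrix `B` are introduced only for illustration; `A` is the symmetric matrix of Example 3.

```python
def quadratic_form(A, x):
    """Return X'AX, the sum over i and j of a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

A = [[1, 2], [2, 3]]          # symmetric matrix of Example 3
B = [[1, 4], [0, 3]]          # a non-symmetric matrix giving the same form

x = (2, -1)
print(quadratic_form(A, x))   # x1^2 + 4*x1*x2 + 3*x2^2 = 4 - 8 + 3 = -1
print(quadratic_form(B, x))   # same value, -1
```

Only the sum $a_{12} + a_{21}$ matters for the cross term, so many matrices give the same form; the symmetric choice splits that sum evenly.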


Example 4 Suppose, in the factor analytic setting of Example 6, Section 8.2,
we wish to weight the tests differently. Let $w_j$ be the weight assigned
to test j and let $W = \operatorname{col}(w_1, w_2, \ldots, w_t)$ be the vector of these weights.
Then the composite score $y_i$ for individual i is

\[
y_i = w_1s_{1i} + w_2s_{2i} + \cdots + w_ts_{ti}
\]

computed by multiplying the observed test scores for individual i by the test
weights, and summing. In matrix notation

\[
Y = W'S
\]

where $Y = (y_1, y_2, \ldots, y_n)$ is the row vector of composite scores and
$S = (s_{ji})$ is the matrix of observed scores.

The average squared composite score (in statistical terms, the "sample
variance" of the composite scores) is

\[
\sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n} y_i^2 = \frac{1}{n}YY' = \frac{1}{n}W'SS'W
\]

Since SS' = nR, where R is the symmetric correlation matrix, we have

\[
\sigma_y^2 = \frac{1}{n}W'(nR)W = W'RW
\]

a quadratic form in the weights. ►


If we began with Equation (16) in Example 3 and wished to put it into
matrix form, we could simply reverse the steps taken in going from (14)
to (16). The reason for going first to (15) is to make sure that the resulting
form involves a symmetric matrix. For instance, the matrix product

\[
(x_1, x_2)\begin{pmatrix} 1 & 4 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\]

is exactly the same as (14), (15), and (16), but the coefficient matrix in the
product is not symmetric.

One major reason for using only symmetric matrices in representing quadratic
forms is that symmetry is preserved under a change of variable. Under a change
of variable, or linear transformation, of the form

\[
X = CY
\]

where C is a nonsingular n × n matrix, so chosen that Y may be expressed
in terms of X as $Y = C^{-1}X$, the quadratic form (13) becomes

\[
X'AX = (CY)'A(CY) = Y'(C'AC)Y
\]

This is a new quadratic form involving the vector $Y = \operatorname{col}(y_1, y_2, \ldots, y_n)$.
If A is symmetric, that is, if A' = A, then

\[
(C'AC)' = C'A'(C')' = C'AC
\]

so that the matrix C'AC of the new form is symmetric also.


Example 5 If A is the symmetric matrix 


A = 



and 



then

;)G Dn-C-D 

is also symmetric. ^ 

The simplest quadratic form is one involving only squared terms. The
matrix of such a form is diagonal. It is an important fact that by making an
appropriate change of variable, any quadratic form can be changed into a
sum of squares. In order to tell how, we need some special terminology.
Two n-dimensional vectors

\[
X = \operatorname{col}(x_1, x_2, \ldots, x_n) \quad \text{and} \quad Y = \operatorname{col}(y_1, y_2, \ldots, y_n)
\]

are called orthogonal if

\[
X'Y = x_1y_1 + x_2y_2 + \cdots + x_ny_n = 0 \tag{17}
\]

That is, X and Y are orthogonal if multiplying corresponding elements of
the two vectors, and adding, gives zero. For instance, X = col (2, 1, −1)
and Y = col (1, 2, 4) are orthogonal since

\[
X'Y = (2)(1) + (1)(2) - (1)(4) = 0
\]

A matrix is called orthogonal if each row (or column) is orthogonal to every
other row (or column) and if in each row (or column) the sum of squares of
the elements is unity. That is, the rows (or columns) are orthogonal to each
other and are each of unit length. Note that one matrix is orthogonal, but
two vectors are orthogonal. For instance, the rotation matrix

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\]

in Example 1 of Section 10.2, is an orthogonal matrix. The most important
property of orthogonal matrices is the following theorem.

Theorem 7 The inverse of an orthogonal matrix is the same as its transpose.
Symbolically, if Q is orthogonal, then

\[
Q' = Q^{-1}
\]

PROOF Since the jth column of Q' is the jth row of Q, the ij element in the
product

\[
QQ'
\]

is the sum of the products of corresponding elements in the ith and jth rows
of Q. The definition of orthogonal matrix requires that this sum of products
is zero if i ≠ j and one if i = j. Hence QQ' = I and $Q' = Q^{-1}$. ►

Theorem 8 If A is a symmetric matrix, then there is an orthogonal matrix
Q such that D = Q'AQ is a diagonal matrix. The diagonal entries in D are
the characteristic roots of A. The columns of Q are characteristic vectors of
A corresponding to the respective characteristic roots in D. ►

We omit the proof of Theorem 8, contenting ourselves with some examples
of its use.


Example 6 The characteristic roots of the symmetric matrix 



are $\lambda_1 = 3$ and $\lambda_2 = -1$. Characteristic vectors corresponding to these
roots have the respective forms

Vi = a and V 2 = b 

where a and b are any constants. Note that the vectors are orthogonal,
regardless of the values of a and b. The sum of squares of the elements in each
vector will be unity if we choose $a = b = 1/\sqrt{2}$. Then, according to
Theorem 8, the orthogonal matrix


is such that 



\[
Q'AQ = D = \operatorname{diag}(3, -1)
\]

Equivalently, if in the quadratic form

\[
X'AX
\]

we make the change of variable X = QY, we obtain the new quadratic form

\[
Y'(Q'AQ)Y = Y'DY = 3y_1^2 - y_2^2
\]

a sum of squares. ►


PROBLEMS

1. Work out the computational details to obtain the results in Example 6.


Example 7 Factor analysis is an important tool in the behavioral sciences
and is based on matrix concepts which we have introduced. We consider
various facets of factor analysis here.

(a) Theorem 8 guarantees that, given any correlation matrix R, there is a
matrix F of factor loadings such that

\[
R = FF' \tag{18}
\]

To determine F, we first find an orthogonal matrix C such that

\[
C'RC = D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \tag{19}
\]

where the $\lambda_i$ are the characteristic roots of R. Writing (19) in the equivalent form

\[
CC'RCC' = CDC'
\]

and invoking Theorem 7 gives

\[
R = CDC'
\]

We now define the square root $D^{1/2}$ of the matrix D by

\[
D^{1/2} = \operatorname{diag}(\lambda_1^{1/2}, \lambda_2^{1/2}, \ldots, \lambda_n^{1/2})
\]

Then

\[
R = CD^{1/2}(D^{1/2})'C'
\]

and we take

\[
F = CD^{1/2}
\]

(b) Choosing $F = CD^{1/2}$ as in (a) gives R = FF'. From Example 6 in
Section 8.2 we know that the matrices S of observed scores and P of standard
scores are related by

\[
S = FP \tag{20}
\]

and that, by definition,

\[
R = \frac{1}{n}SS' \tag{21}
\]

If the matrix F is square (that is, the number of tests equals the number of
factors) and nonsingular, then the matrix Z of interfactor correlations is

\[
\begin{aligned}
Z = \frac{1}{n}PP' &= \frac{1}{n}[F^{-1}S][S'(F')^{-1}] && \text{[by (20)]} \\
&= F^{-1}\left(\frac{1}{n}SS'\right)(F^{-1})' \\
&= F^{-1}R(F^{-1})' && \text{[by (21)]} \\
&= F^{-1}FF'(F^{-1})' = I && \text{[by (18)]}
\end{aligned}
\]

In this case, the factors are uncorrelated.

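(c) As a numerical illustration of the factorization of a correlation matrix,
consider

\[
R = \begin{pmatrix} 1 & 0.68 & 0.45 \\ 0.68 & 1 & 0.36 \\ 0.45 & 0.36 & 1 \end{pmatrix}
\]

The characteristic equation of R is

\[
\begin{aligned}
\det(R - \lambda I) &= (-\lambda)^3 + (1 + 1 + 1)(-\lambda)^2 + (0.5376 + 0.7975 + 0.8704)(-\lambda) + 0.4258 \\
&= -\lambda^3 + 3\lambda^2 - 2.2055\lambda + 0.4258 \\
&= 0
\end{aligned}
\]

Solving this equation we find (to four decimal places) the characteristic roots

\[
\lambda_1 = 2.0067 \qquad \lambda_2 = 0.6823 \qquad \lambda_3 = 0.3110
\]

Corresponding characteristic vectors having unit length are

\[
X_1 = \begin{pmatrix} 0.627 \\ 0.600 \\ 0.496 \end{pmatrix} \qquad
X_2 = \begin{pmatrix} 0.243 \\ 0.454 \\ -0.857 \end{pmatrix} \qquad
X_3 = \begin{pmatrix} -0.740 \\ 0.658 \\ 0.140 \end{pmatrix}
\]

If we use these vectors as columns of an orthogonal matrix C and write
$D = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3)$, we find

\[
\begin{aligned}
F = CD^{1/2} &= C\operatorname{diag}(\lambda_1^{1/2}, \lambda_2^{1/2}, \lambda_3^{1/2}) \\
&= \begin{pmatrix} 0.627 & 0.243 & -0.740 \\ 0.600 & 0.454 & 0.658 \\ 0.496 & -0.857 & 0.140 \end{pmatrix}
\begin{pmatrix} 1.417 & 0 & 0 \\ 0 & 0.826 & 0 \\ 0 & 0 & 0.558 \end{pmatrix} \\
&= \begin{pmatrix} 0.888 & 0.201 & -0.413 \\ 0.850 & 0.375 & 0.367 \\ 0.703 & -0.707 & 0.078 \end{pmatrix}
\end{aligned}
\]
►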
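The factorization in part (c) can be checked by rebuilding R from the quoted roots and vectors. This is a sketch under stated assumptions: the numbers are the rounded values of the example, and the entry $r_{23} = 0.36$ is inferred from the principal minor $0.8704 = 1 - 0.36^2$ that appears in the characteristic equation. Agreement is therefore only to about two decimal places.

```python
import math

R = [[1.00, 0.68, 0.45],
     [0.68, 1.00, 0.36],
     [0.45, 0.36, 1.00]]

roots = [2.0067, 0.6823, 0.3110]
C = [[0.627,  0.243, -0.740],
     [0.600,  0.454,  0.658],
     [0.496, -0.857,  0.140]]

# F = C D^(1/2): scale column k of C by the square root of the k-th root.
F = [[C[i][k] * math.sqrt(roots[k]) for k in range(3)] for i in range(3)]

# FF' should reproduce R up to the rounding in the three-decimal vectors.
FFt = [[sum(F[i][k] * F[j][k] for k in range(3)) for j in range(3)]
       for i in range(3)]

max_error = max(abs(FFt[i][j] - R[i][j]) for i in range(3) for j in range(3))
print(max_error)
```

A small `max_error` confirms R = FF' within rounding; a large one would point to a misread entry in C, the roots, or R.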

If A is symmetric and Q'AQ = D is a diagonal matrix containing the
characteristic roots of A, then the linear transformation X = QY changes
the quadratic form X'AX into a sum of squares. For

\[
X'AX = (Y'Q')A(QY) = Y'(Q'AQ)Y = Y'DY
\]

In summation form,

\[
X'AX = Y'DY = \lambda_1y_1^2 + \lambda_2y_2^2 + \cdots + \lambda_ny_n^2 = \sum_{i=1}^{n} \lambda_iy_i^2 \tag{22}
\]

where $\lambda_1, \ldots, \lambda_n$ are the characteristic roots of A.

If all the $\lambda_i$ are positive in (22), then no matter what values for X are
introduced, the quadratic form cannot be negative. Such a quadratic form is
called positive definite. Similarly, a quadratic form is called negative definite
if all $\lambda_i$ are negative, and indefinite if some $\lambda_i$ are positive and some negative.
In the former case, the form has negative values for every nonzero choice of
X, while in the latter, some choices of X make the form positive and some
make it negative.


Example 8 The correlation matrix R in Example 7 is positive definite
since all characteristic roots are positive. The matrix A in Example 6 is
indefinite. ►
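The classification depends only on the signs of the characteristic roots, so it can be sketched in a few lines. The function name `classify` is hypothetical, and the final branch covers a case the text does not name (some root equal to zero).

```python
def classify(roots):
    """Classify a quadratic form from the characteristic roots of its symmetric matrix."""
    if all(r > 0 for r in roots):
        return "positive definite"
    if all(r < 0 for r in roots):
        return "negative definite"
    if any(r > 0 for r in roots) and any(r < 0 for r in roots):
        return "indefinite"
    return "semidefinite (some root is zero)"   # case not named in the text

print(classify([2.0067, 0.6823, 0.3110]))  # positive definite (R of Example 7)
print(classify([3, -1]))                   # indefinite (A of Example 6)
```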


PROBLEMS 

In Problems 2-5, express the given quadratic form as a sum 


m 30 3 ’ (u>c) 0- 



(*i,X2,x 3 )( 4 4 2 
V2 2 I, 


(yu y2, >3, yS 


In Problems 6-9, express the given quadratic form as a matrix product.

6. $2u^2 - 8uv + 6v^2$    7. $3x^2 + 16xy - 2y^2$

8. $12xy - 4x^2 - y^2$    9. $x^2 + 2y^2 + z^2 + 2xy + 2xz$

10. Verify that if A is the symmetric matrix 

Oil:!) 

and R is the matrix 


/ l/v'S 

I/V3 

\/Vi 

( — 2/\ / 5 

1/V3 

0 

\ lA'S 

1/vA 

— l/v^S 

ihen R'AR - diaist*. —2, —2) 




In Problems 11-14, find an orthogonal matrix Q such that Q'AQ = D,
where D is diagonal.



(Hint: The number zero is a repeated characteristic root. Make sure
the two characteristic vectors you choose are orthogonal.)

16. Verify that the C matrix of Example 7 is indeed orthogonal.

17. As in Example 7, factor the following correlation matrices, finding (1) the
characteristic roots, (2) the orthogonal matrix C, and (3) the factor
matrix F. Verify that FF' = R. (Hint: In (a), 2 is a root.)


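\[
\text{(a)}\quad R = \begin{pmatrix} 1 & -0.4 & -0.68 \\ -0.4 & 1 & 0.4 \\ -0.68 & 0.4 & 1 \end{pmatrix}
\qquad
\text{(b)}\quad R = \begin{pmatrix} 1 & 0.6 & 0.8 \\ 0.6 & 1 & 0 \\ 0.8 & 0 & 1 \end{pmatrix}
\]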


18. Determine which of the quadratic forms in Problems 2-9 are
(a) positive definite (b) negative definite (c) neither

19. A matrix B is said to be skew-symmetric if it is equal to the negative of
its transpose, that is, if B' = −B. The matrices



are skew-symmetric. On the other hand the matrix

CD 

is not skew-symmetric. Why?

20. If the r × r matrix A is symmetric (or skew-symmetric), and P is any
r × c matrix, prove that P'AP is symmetric (or skew-symmetric).

21. Prove that if P is any r × c matrix, then P'P is symmetric.

22. Must all symmetric and skew-symmetric matrices necessarily be square?
Why or why not?

23. Prove that if A is symmetric (or skew-symmetric) and if c is any scalar,
then cA is symmetric (or skew-symmetric).

24. Prove that if A is any square matrix, then S = A + A' is symmetric and
T = A − A' is skew-symmetric.

25. It is sometimes useful to diagonalize matrices which are not symmetric.
In this case the matrix of characteristic vectors is ordinarily not orthogonal.
For instance, suppose we have a closed Leontief system with input
matrix



where $\alpha$ and $\beta$ lie between zero and one. Suppose further that for
t = 0, 1, 2, ..., the vector X(t + 1) of system outputs in the (t + 1)st time
period is related to the vector X(t) by

\[
X(t + 1) = AX(t)
\]

(a) Prove that for all t,

\[
X(t) = A^tX(0)
\]

(b) Find the matrix $D = \operatorname{diag}(\lambda_1, \lambda_2)$, where $\lambda_1$ and $\lambda_2$ are the
characteristic roots of A.

(c) Find a matrix $C = (V_1, V_2)$ where $V_1$ and $V_2$ are characteristic
vectors corresponding to $\lambda_1$ and $\lambda_2$.

(d) Verify that $C^{-1}AC = D$ and hence that $A = CDC^{-1}$.

(e) Prove that $A^t = CD^tC^{-1}$.

(f) Prove that

\[
\lim_{t \to \infty} X(t) = C\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}C^{-1}X(0)
\]

and hence that this limit is

\[
\frac{1}{\alpha + \beta}\begin{pmatrix} \beta & \beta \\ \alpha & \alpha \end{pmatrix}X(0)
\]


SUPPLEMENTARY READING

Hohn, F. E., Elementary Matrix Algebra (Macmillan Company, New York, 1964).

Thurstone, L. L., Multiple-Factor Analysis (University of Chicago Press, Chicago,
1947).

Zelinsky, D., A First Course in Linear Algebra (Academic Press, Inc., New York,
1963).


PART III 


Calculus 



11 FUNCTIONS, LIMITS, AND CONTINUITY


11.1 REAL-VALUED FUNCTIONS 

Recall (Section 4.1) that a function associates with each element of a certain
set, called the domain of the function, an element of another set, called the
range of the function. If the function f associates the element b in the range
of f with the element a in the domain of f, then b is called the value of f at
a, and we write b = f(a). A function whose range is a set of real numbers is
called a real-valued function. Our study of calculus begins with real-valued
functions.

Figures 1 and 2 are graphs of two common real-valued functions, the
squaring function $g(x) = x^2$ and the square root function $f(x) = \sqrt{x}$. Here
we have used a formula as a shorthand method of indicating a real-valued function
and of computing its values. Strictly speaking, the squaring function g is the set

\[
g = \{(x, y) : x \text{ is a real number and } y = x^2\}
\]

of ordered pairs. It is common practice instead to speak of "the function
$g(x) = x^2$." As long as we keep in mind the distinction between a function and
its values (see Section 4.1), no harm should come from this simplification.

We adopt the convention that the domain of any real-valued function
defined by a formula is the largest set of real numbers for which the formula
makes sense (that is, produces another real number). Thus the domain
of the function $f(x) = \sqrt{x}$ in Figure 2 is the set of non-negative real numbers,
since negative numbers do not have square roots which are real numbers.

It is important to note that the symbol used to identify a function, whether
f, g, h, or some other symbol, is an arbitrary one. The important factor is
the formula itself. The symbol used to denote a member of the domain of
the function is also arbitrary. Thus, for example, the notations
$g(x) = \sqrt{x}$, $h(y) = \sqrt{y}$, and $m(t) = \sqrt{t}$ all describe the same function,
namely the square root function. In each case the formula assigns to any
non-negative number its square root. The set of ordered pairs which comprise
the square root function can be described by any of the above notations and in
this sense they are identical.

The graphs of several real-valued functions are shown in Figure 3. Many
of these graphs should be familiar from previous mathematics studies. To
begin a study of calculus a substantial portion of the subject matter is
devoted to a thorough study of these and related functions.
These functions are sometimes referred to ...

FIGURE 1 Squaring function

FIGURE 2 Square root function



PROBLEMS 

1. Draw graphs of the following functions. 

(a) constant function h(t) = 2

(b) constant function F(v) = c, where c is a real number

(c) (2 for *<0 
sW=, U for .c > 0 

Example 1 This example is designed to emphasize the use of functional
notation. Suppose f is a function which assigns to any real number x (≠ 0)
the value

\[
f(x) = x^2 - 1 + \frac{1}{x} \tag{1}
\]

Then f assigns to the number y the value

\[
f(y) = y^2 - 1 + \frac{1}{y}
\]

and f assigns to 3 the value

\[
f(3) = 3^2 - 1 + \frac{1}{3} = \frac{25}{3}
\]

Similarly,

\[
f(x - 1) = (x - 1)^2 - 1 + \frac{1}{x - 1}
\]

\[
f(x^2) = (x^2)^2 - 1 + \frac{1}{x^2} = x^4 - 1 + \frac{1}{x^2}
\]

Thus, we see that in order to compute the value which f assigns to any
quantity Q, we simply replace x by Q in Formula (1). The quantity Q may or
may not be an expression involving x.
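Replacing x by a quantity Q is exactly what calling a function with an argument does. The sketch below defines Formula (1) as a Python function and uses exact fractions so the value at 3 can be compared with the hand computation; the variable name `q` is chosen here for illustration.

```python
from fractions import Fraction

def f(x):
    """f(x) = x^2 - 1 + 1/x, defined for x != 0."""
    return x * x - 1 + 1 / x

print(f(Fraction(3)))          # 25/3

q = Fraction(3) - 1            # the quantity Q = x - 1 evaluated at x = 3
print(f(q))                    # f(2) = 4 - 1 + 1/2 = 7/2
```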

PROBLEMS 

2. The domain of each of the following functions is the set R of real
numbers. Sketch the graph of each function by plotting a few points and
drawing a smooth curve through the points.

fat f(x\ = 4x 00 s(0 “ 3 - 2 ' 

(e)G(y)=y*-6y + 2 (0 H(x) = 2x-x 2 

(g) a(s) = s 2 - 4 

3. Indicate the range of each of the functions in Problem 2. 

. , . r u, ^raicht line passing through the points (2, 8) 

^12)"“ Every straight line has the functional form/W = 
ax + b, where a and b are constants ) 





FIGURE 3 Graphs of some familiar functions 

(a) straight line function, (b) sine function, 

(c) cosine function, (d) exponential function, 

(e) natural logarithm function, and (f) tangent function 






(d) Although its limit is 1, the sequence (1, 1, 1, ...) is not one-deleted,
since 1 is a member of the sequence. ►


PROBLEMS 

1. Which of the following are 2-deleted sequences?

(a) (2, 2, 2, 2, ) (b) (3,2,1, 1|, If, l£, ) 

(c) (1. 1?, 1 b lit, ) (d) (1 , 1 1 h tV ) 

The procedure to be followed in defining limit is best introduced by means
of examples.

Example 2 The graph of the function f defined by

\[
f(x) = \begin{cases} 2 & \text{when } x < 0 \\ 1 & \text{when } x > 0 \end{cases}
\]

is shown in Figure 4. Any zero-deleted sequence $(x_1, x_2, x_3, \ldots)$ of negative
numbers generates a corresponding sequence of functional values
$(f(x_1), f(x_2), f(x_3), \ldots) = (2, 2, 2, \ldots)$, all of which are equal to two.
The number 2 is the limit of all such sequences of functional values. Similarly,
if $(y_1, y_2, y_3, \ldots)$ is a zero-deleted sequence of positive numbers, then
$(f(y_1), f(y_2), f(y_3), \ldots)$ is the functional value sequence (1, 1, 1, ...),
whose limit is 1.

FIGURE 4


Not all zero-deleted sequences of points in the domain of f generate
convergent sequences of functional values. For instance, the sequence

(1. - I,*. ) 

generates the sequence of functional values

(1, 2, 1, 2, 1, 2, ...),

which does not converge. ►



5. Let f(x) = 2x + 6 and g(x) = 4x − 12.

(a) For what number x is f(x) = g(x)?

(b) For what numbers x is f(x) < g(x)?

(c) Draw a graph to illustrate (a) and (b).

6. (a) Find the values of x for which

\[
f(x) = x^2 + 3x + 2 = 0
\]

Draw a graph of f on which these values are displayed.
(b) Same problem for the function $g(x) = x^2 - 6x + 8$.

7. The Spearman-Brown formula (see Example 4, Section 5.3) for reliability
of a test having initial reliability r, which is then lengthened by a
factor of n, is

\[
f(n) = \frac{nr}{1 + (n - 1)r}
\]

(a) Compute $f(n^2)$.

(b) Compute f[f(n)].

(c) Show that the function g defined by

\[
g(n) = n\left[\frac{1}{f(n)} - 1\right]
\]

is a constant function, that is, g has the same value regardless of
the value of n.

11.2 LIMITS

We now define the concept of limit for arbitrary functions in terms of the
concept of limit for sequences. In order to simplify our terminology, we shall
call a sequence, none of whose terms equals p but which converges to the
number p, a p-deleted sequence. That is, $(x_n)$ is a p-deleted sequence if

\[
\lim (x_n) = p \quad \text{and for all } n, \; x_n \neq p
\]

Example 1 (a) The sequence $(1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots)$ is a zero-deleted sequence
since zero is not a member of the sequence and the sequence converges to
zero.

(b) The sequence

\[
\left(1, -1, \tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{3}, -\tfrac{1}{3}, \ldots\right)
\]

is also a zero-deleted sequence.

(c) Tlic sequence (2, 14, 11, H ij \ 

““ ,hC 1,m '‘ 15 0 » ». howclcr," 



FIGURE 6


Example 5 (a) The discussion in Example 2 implies that $\lim_{x \to 0} f(x)$
does not exist since there is no common limit for all sequences of functional
values generated by zero-deleted sequences of points in the domain of f.

(b) In Example 3, $\lim_{n \to \infty} g(x_n) = 5$ whenever $(x_n)$ is a 4-deleted sequence.
Hence $\lim_{x \to 4} g(x) = 5$.

(c) In Example 4, $\lim_{x \to 0} h(x) = 1$ since every zero-deleted sequence of
points in the domain of h generates a sequence of functional values
converging to 1. ►

Example 6 If h is a constant function defined for all numbers x by

\[
h(x) = c
\]

where c is any real number, then every p-deleted sequence of points
$(x_1, x_2, x_3, \ldots)$ generates a corresponding sequence of functional values
(c, c, c, ...) all equal to c. Since every such generated sequence converges
to c, we see that

\[
\lim_{x \to p} h(x) = c
\]

regardless of the value of p. ►

Example 7 Let I denote the identity function

\[
I(x) = x
\]

Then any p-deleted sequence of numbers $(x_1, x_2, x_3, \ldots)$ produces the
corresponding functional value sequence

\[
(I(x_1), I(x_2), I(x_3), \ldots) = (x_1, x_2, x_3, \ldots)
\]

which also converges to p. That is,

\[
\lim_{x \to p} I(x) = \lim_{x \to p} x = p
\]
►


Example 3 If g(x) = 2x − 3 and $(x_1, x_2, x_3, \ldots)$ is any 4-deleted
sequence, then the limit of the corresponding functional value sequence is

\[
\lim g(x_n) = \lim (2x_n - 3) = 2 \cdot 4 - 3 = 5
\]

Every sequence $(x_n)$ of points in the domain of g which converges to 4
generates a corresponding functional value sequence $(g(x_n))$ in the range of g
which converges to 5 (see Figure 5). Note that in this case the common
limit 5 is the value of g at the point 4, that is, g(4) = 2 · 4 − 3 = 5. ►
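The behavior described in Example 3 can be observed numerically. The two sequences below are hypothetical 4-deleted sequences chosen for illustration; a finite computation can suggest the limit but does not prove it.

```python
def g(x):
    return 2 * x - 3

# Two 4-deleted sequences: each converges to 4 and never equals 4.
from_above = [4 + 1 / n for n in range(1, 10001)]
alternating = [4 + (-1) ** n / n for n in range(1, 10001)]

for seq in (from_above, alternating):
    values = [g(x) for x in seq]
    print(values[0], values[-1])   # the last value is close to 5
```

Both functional value sequences approach 5 even though the points approach 4 from different sides, which is what the definition of limit requires of every 4-deleted sequence.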

FIGURE 5 


s 



h(x) = 


when x j* 0 
when x = 0 


or funttRm.iU r akut i; (l Ltl [ St I lUCn ) C ii 1 c Jc C |lf 0main ° f/ ' 8cncrJlcs tl,c sct l ucnc ‘ : 

' J|UC ^ functional value ~ 


Let f be a real-valued function. If for every p-deleted sequence
$(x_1, x_2, x_3, \ldots)$ of points in the domain of f, the corresponding functional
value sequence $(f(x_1), f(x_2), f(x_3), \ldots)$ converges to the same number L,
then L is called the limit of f at p, and we write

\[
\lim_{x \to p} f(x) = L
\]

or, in summary notation,

\[
\lim_{p} f = L
\]



expect this new concept of limit to have the same properties as the concept
of limit for sequences. (See Property 4 for sequences in Section 5.3.)


Theorem 1 Let p be a point, common to the domains of functions f and
g, for which

\[
\lim_{x \to p} f(x) = L \quad \text{and} \quad \lim_{x \to p} g(x) = M
\]

Then

\[
\lim_{x \to p} (f + g)(x) = L + M
\]

In words, this says that the limit of a sum is the sum of the limits.

PROOF Our assumptions imply that if $(x_1, x_2, x_3, \ldots)$ is any p-deleted
sequence, then the functional value sequences $(f(x_1), f(x_2), f(x_3), \ldots)$ and
$(g(x_1), g(x_2), g(x_3), \ldots)$ converge, respectively, to L and to M. Hence, by
Property 4 for sequences (see Section 5.3) and by the definition of f + g,
the sequence

\[
((f + g)(x_1), (f + g)(x_2), (f + g)(x_3), \ldots)
\]

converges to L + M. ►


Other properties of limits of functions which are algebraic combinations
of other functions follow in similar fashion from the analogous properties
for sequences. Therefore the proof of the following theorem is left as an
exercise.

Theorem 2 Let p be a point, common to the domains of functions f and
g, for which

\[
\lim_{x \to p} f(x) = L \quad \text{and} \quad \lim_{x \to p} g(x) = M
\]

Then

(i) $\lim_{x \to p} (f - g)(x) = L - M$

(ii) $\lim_{x \to p} (f \cdot g)(x) = L \cdot M$

(iii) If M ≠ 0, $\lim_{x \to p} (f/g)(x) = L/M$

Thus, the limit of a difference, product, or quotient of two functions is the
corresponding difference, product, or quotient of their respective limits. ►


Example 9 The squaring function $g(x) = x^2$ is the product of the identity
function I(x) = x times itself. It follows from Theorem 2 and Example 7 that

\[
\lim_{x \to p} x^2 = \left[\lim_{x \to p} I(x)\right]\left[\lim_{x \to p} I(x)\right] = p \cdot p = p^2
\]
►



PROBLEMS 


2 


Define the function g by 


g(x) = 



Find lim I _, 4 g(x) 


when x 4 
when x = 4 


We wish to be able to find the limits of sums, differences, products, and
quotients of functions. We digress from this problem to look at the definitions
of these algebraic combinations of arbitrary functions. We will see
that these algebraic combinations of functions are defined in the same way
as for sequences. Thus, if f and g are two real-valued functions we define, for
all points x common to the domains of both f and g,

(a) the sum f + g of f and g by

\[
(f + g)(x) = f(x) + g(x)
\]

(b) the difference f − g of f and g by

\[
(f - g)(x) = f(x) - g(x)
\]

(c) the product f · g of f and g by

\[
(f \cdot g)(x) = f(x) \cdot g(x)
\]

(d) the quotient f/g of f and g, for g(x) ≠ 0, by

\[
\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}
\]
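These four definitions translate directly into functions that take functions and return functions. The sketch below is illustrative only: the helper names and the two sample functions are hypothetical and are not the functions of the example that follows in the text.

```python
def add(f, g):
    return lambda x: f(x) + g(x)

def sub(f, g):
    return lambda x: f(x) - g(x)

def mul(f, g):
    return lambda x: f(x) * g(x)

def div(f, g):
    def quotient(x):
        # The quotient is defined only where g(x) is not zero.
        if g(x) == 0:
            raise ValueError("quotient undefined where g(x) = 0")
        return f(x) / g(x)
    return quotient

# Hypothetical functions chosen only for illustration.
f = lambda x: 2 * x
g = lambda x: x - 1

print(add(f, g)(3))   # 6 + 2 = 8
print(sub(f, g)(3))   # 6 - 2 = 4
print(mul(f, g)(3))   # 6 * 2 = 12
print(div(f, g)(3))   # 6 / 2 = 3.0
```

Calling `div(f, g)(1)` raises an error because g(1) = 0, mirroring the restriction g(x) ≠ 0 in definition (d).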


then n l T rea ‘ VaIUed functl0ns fix) = 2** and 8 W 

Hence the’ {l for a " x wh,Ie S is defined only for - 1 < x < 1 

defined onl^; d 'f P^cl and qnot.ent of these two fnncTions are 
_ or a point x in this common domain, we have 


(/+ «)w ■ 

(f-SXx)- 
(S -/)M : 
(/ g)(.x) - 



w- 



ifx ^ ±1 


if* M 0 


A - been 

deleted sentences of pents ,n the denta^f £%££%££££ 


280 / FUNCTIONS 


limits and continuity 



ing total revenue $r_0 = f(s_0)$, the average revenue per sales unit is

    $A(s_0) = \frac{r_0}{s_0}$

represented in Figure 7 by the slope of the line OP. As total sales increase
from $s_0$ to $s_1$, so that revenue increases from $r_0 = f(s_0)$ to $r_1 = f(s_1)$, the
average rate of increase is

    $\frac{r_1 - r_0}{s_1 - s_0} = \frac{f(s_1) - f(s_0)}{s_1 - s_0}$    (2)

represented by the slope of the line PQ.

If we now think of keeping point P fixed and moving Q toward P along the
curve, the average rate (2) can in the limit be interpreted as the "instantaneous"
rate of increase at $s_0$. It is this limit which defines marginal revenue.
(Geometrically, this limit defines the slope of a line tangent to the graph of
f at $s_0$.)

To take a specific case, suppose the total revenue (in millions) is given in
terms of total sales (in thousands) by

    $R = f(S) = 8S$ when $0 \le S \le 1$
    $R = f(S) = 12 - (4/S^2)$ when $S > 1$

(see Figure 8). Then the marginal revenue at the point (2, 11) is


FIGURE 8 [graph of the revenue function; vertical axis R, horizontal axis S (thousands)]





Theorem 1 and Theorem 2(ii) may be extended by induction to the sum
and product of any number of functions. The proof of the following theorem
is left as an exercise.

Theorem 3 Let p be a point common to the domains of the functions
$f_1, f_2, \ldots, f_n$, and suppose that for $k = 1, 2, 3, \ldots, n$,

    $\lim_{x \to p} f_k(x) = L_k$

Then

    (i) $\lim_{x \to p} [f_1(x) + f_2(x) + \cdots + f_n(x)] = L_1 + L_2 + \cdots + L_n$

    (ii) $\lim_{x \to p} [f_1(x) f_2(x) \cdots f_n(x)] = L_1 L_2 \cdots L_n$ ►

Example 10 A polynomial is a function of the form

    $f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$

which is produced by adding the products of constant functions ($c_k$) and
identity function products ($x^k$). According to Theorem 3, we have

    $\lim_{x \to p} f(x) = \lim_{x \to p} c_0 + [\lim_{x \to p} c_1][\lim_{x \to p} x] + [\lim_{x \to p} c_2][\lim_{x \to p} x]^2 + \cdots + [\lim_{x \to p} c_n][\lim_{x \to p} x]^n$

    $= c_0 + c_1 p + c_2 p^2 + \cdots + c_n p^n$

    $= f(p)$

The limit of a polynomial f at p is just the functional value f(p). ►
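The conclusion of Example 10 can be observed numerically. The following sketch evaluates a polynomial along a p-deleted sequence and records how far each value is from f(p). The coefficients and the point p are hypothetical, chosen only for illustration, and Horner's rule is used simply as a convenient way to evaluate the polynomial.

```python
def poly(coeffs, x):
    """Evaluate c0 + c1*x + ... + cn*x**n, where coeffs = [c0, c1, ..., cn]."""
    value = 0.0
    for c in reversed(coeffs):  # Horner's rule: work from cn down to c0
        value = value * x + c
    return value

coeffs = [1.0, -2.0, 0.0, 3.0]  # hypothetical polynomial 1 - 2x + 3x^3
p = 2.0

# A p-deleted sequence p + 1/10, p + 1/100, ... and the matching functional values.
values = [poly(coeffs, p + 1.0 / 10 ** n) for n in range(1, 8)]
gaps = [abs(v - poly(coeffs, p)) for v in values]
```

The list `gaps` decreases toward zero, in agreement with the statement that the limit of f at p equals f(p).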

Example 11 An important concept in economic theory is that of marginal
revenue, or the rate at which revenue increases relative to sales. Consider a
firm whose total revenue R is related to total sales S by a function f, called
the revenue function (see Figure 7). For a given total sales $s_0$ and correspond-


FIGURE 7 






    $p = g(x)$

where p denotes price and x is the quantity demanded. Define the total
revenue function by

    $R(x) = x \cdot g(x)$

If $\lim_{x \to 0} g(x) = 7$, find the marginal revenue at x = 0. Interpret your
results.

14. Let f and g be defined by $f(x) = 2x - 6$ and $g(x) = 4x + 2$. Define
the following functions and indicate the domain and range of each.

(a) f + g (b) f - g (c) f · g

(d) f/g (e) g/f

15. In Problem 14 let $f(x) = x + 3$ and $g(x) = x^2 + 5x + 6$.

16. Prove Theorem 2.

17. Prove Theorem 3.

18. Prove that if $\lim_{x \to p} f(x) = a$, then for any constant k,

    $\lim_{x \to p} k f(x) = k \cdot a$

19. Prove that if $\lim_{x \to p} f(x) = A$ and $\lim_{x \to p} f(x) = B$, then A = B.

20. Prove that if $\lim_{x \to p} g(x) = L$, then $\lim_{x \to p} [g(x) - L] = 0$.

21. Find two functions f and g such that neither has a limit at x = 0 but
whose sum f + g has a limit at that point.

22. Prove that if $\lim_{x \to p} g(x) = L$, then $\lim_{x \to 0} g(p + x) = L$.


11.3 MORE ABOUT LIMITS 

A sequence $(x_1, x_2, x_3, \ldots)$ is said to diverge to infinity if for every positive
number m, no matter how large, there is an integer k such that $x_n > m$ when
n > k. That is, a sequence diverges to infinity if, by going out far enough in
the sequence (past the kth term) we can find terms as large as we wish
(larger than m). Similarly $(x_1, x_2, x_3, \ldots)$ diverges to minus infinity if for
every negative number m, no matter how large in magnitude, there is an integer k such
that $x_n < m$ when n > k. For instance, the sequences (1, 2, 3, 4, ...) and
(1, 4, 9, 16, ...) diverge to infinity, while (-1, -2, -3, -4, ...) diverges
to minus infinity (see Figure 9).
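The definition asks, for each bound m, for an index k past which every term exceeds m. For the two increasing sequences mentioned above such a k can be found by direct search, as in the sketch below. The helper name `cutoff_index` and the bound m = 1000 are illustrative assumptions, and the search relies on the sequence being increasing.

```python
def cutoff_index(term, m, limit=10 ** 6):
    """For an increasing sequence term(1), term(2), ..., return an integer k
    such that term(n) > m whenever n > k.

    Assumes the sequence is increasing, so the first term beyond m settles it.
    """
    n = 1
    while term(n) <= m:
        n += 1
        if n > limit:
            raise ValueError("no cutoff found within the search limit")
    return n - 1

k_linear = cutoff_index(lambda n: n, 1000)      # sequence (1, 2, 3, 4, ...)
k_square = cutoff_index(lambda n: n * n, 1000)  # sequence (1, 4, 9, 16, ...)
```

Repeating the search with larger values of m always succeeds for these sequences, which is exactly what divergence to infinity requires.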

FIGURE 9 Divergence to infinity

If for every sequence of numbers $(x_1, x_2, x_3, \ldots)$ which diverges to
infinity the corresponding sequence of functional values

    $(f(x_1), f(x_2), f(x_3), \ldots)$





    $\lim_{S \to 2} \frac{f(S) - f(2)}{S - 2} = \lim_{S \to 2} \frac{[12 - (4/S^2)] - [12 - (4/2^2)]}{S - 2}$

    $= \lim_{S \to 2} \frac{S^2 - 4}{S^2 (S - 2)}$

    $= \lim_{S \to 2} \frac{S + 2}{S^2}$

    $= 1$

Thus, if it were possible to increase sales by one unit under the conditions
which prevail when S = 2, revenue would also increase by one unit. (Geometrically,
at S = 2 the line tangent to the revenue function inclines 45° to
the horizontal.) ►
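The limit computed for the marginal revenue can be checked by evaluating the average rate of increase over shorter and shorter sales intervals starting at S = 2. This is a numerical sketch only. The function names are illustrative, and the boundary S = 1 is assigned to the first branch of the revenue function, which is an assumption about how the two cases meet.

```python
def revenue(S):
    """Revenue (millions) in terms of sales S (thousands), as in Example 11."""
    if S <= 1:
        return 8 * S
    return 12 - 4 / S ** 2

def average_rate(S, h):
    """Average rate of increase of revenue between sales levels S and S + h."""
    return (revenue(S + h) - revenue(S)) / h

# Average rates over intervals of length 0.1, 0.01, ..., 0.000001 starting at S = 2.
rates = [average_rate(2.0, 10.0 ** (-n)) for n in range(1, 7)]
```

The entries of `rates` climb toward 1, the marginal revenue found analytically.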


PROBLEMS 

In Problems 3-10 assume that the domain of the function is the entire real
line excluding points where the denominator is zero. Compute the indicated
limits.


3. lim 


s( x+ ;) 


5. lim i-jJ 
*-.o x + 3 

7 1 x 2 -4 
1. hm — — - 
— 2 X+2 


8. hm 

A— *0 


9. lim 


(I + hf - 1 
h 

x 2 + 3x- 10 


4. lira (x 2 + 2x + 3) 

- 4 

5. lim — 

— x + 2 

(Hint x 2 - 4 - (* _ 2)(x + 2) ) 

(Hint There is a factor of h in the numerator ) 


-2 * 2 — 6x + 8 
11. Let 


10 . tog+ tf-2 1 

A— »o h 


= \ x + 2 when * < 0 

l-x + 1 when x > 0 

Which of the following limit statements are correct?

(a) iim/(jc) = 2 ,,, , 

*— o (b) Iim/(x) = 1 

(c) Iim/(v) =* 3 , 

4-1 (d) lim f( x ) = 2 

(c) lim f{x) = 0 ’ 


12. Suppose that the cost function for a commodity is

    $C = 10 + 2x$

where x is the amount produced. Show that the marginal cost is
MC = 2 for all x.

13. Suppose that the demand function for a commodity is



$N_0$ is the population size at some arbitrary initial point $t_0$ in time, $N(t)$
denotes population size at time $t > t_0$, $\theta > 0$ is a constant growth rate, and e is a
mathematical constant approximately equal to 2.718. Obviously, for any
time sequence $(t_n)$ diverging to infinity, the corresponding sequence $(N(t_n))$
also diverges to infinity. Thus

    $\lim_{t \to \infty} N(t) = \infty$ ►


PROBLEMS 

1. Find 

2. Prove that 


3. Find the following limits. 

1 


(a) lim 


x—*2 (X ~ 2)2 


(c) lim - 

X—ttC 

(e) lim 


3 

(x - 2)2 
6 

» (* ~ 4) s 


lim 


3 (X - 3)2 


lim = 

z_0 X 2 


(b) lim 


(d) lim 
i— >o 

(f) lim 


x2 - 4 

M 


x + 4 


4. In Problem 7, Section 11.1 assume that n can be any positive number.
Find $\lim_{n \to \infty} f(n)$.


5. Suppose, in a population of N individuals, that the average number of
individuals who have completed a particular task by time t is

    $A(t) = N(1 - 2^{-at})$

where a is a positive constant. Find $\lim_{t \to \infty} A(t)$.


6. Given certain assumptions about the nature of the items added, the
correlation* between two tests X and Y of altered length is

    $R(a, b) = \frac{\sqrt{ab}\, R_{xy}}{\sqrt{1 + (a - 1) R_x}\, \sqrt{1 + (b - 1) R_y}}$

where $R_{xy}$, $R_x$, and $R_y$ are, respectively, the correlation between the
unlengthened tests, the reliability of test X, and the reliability of test Y.
The quantities a and b represent the lengthening factors of the two tests.

(a) For each fixed b, find $F(b) = \lim_{a \to \infty} R(a, b)$.

(b) Find $\lim_{b \to \infty} F(b)$.


7. Work out details for Examples 2 and 3.

8. Find $\lim_{x \to \infty} [(x^2 + 1)^{1/2} - x]$.

9. Prove that $\lim_{x \to \infty} [\sqrt{x(x + a)} - x] = a/2$.


*Adapted from Gulliksen, H. A., Theory of Mental Tests (John Wiley & Sons, Inc., New
York, 1950).





converges and has limit L, then L is called the limit of f(x) as x approaches
infinity and we write

    $\lim_{x \to \infty} f(x) = L$

The symbol $x \to \infty$ is read "x approaches infinity." It is intended to mean
only that x increases without bound.

Similarly, if the sequence $(f(x_1), f(x_2), f(x_3), \ldots)$ converges to L whenever
$(x_1, x_2, x_3, \ldots)$ diverges to minus infinity, then L is the limit of f(x) as
x approaches minus infinity, and we write

    $\lim_{x \to -\infty} f(x) = L$


Example 1 If $f(x) = 1/x$, then $\lim_{x \to \infty} f(x) = 0$. To see this, let $(x_1, x_2,
x_3, \ldots)$ be any sequence diverging to infinity and let r be a positive number.
Then 1/r is positive and there is an integer k such that

    $x_n > \frac{1}{r} > 0$ when n > k

Equivalently,

    $0 < \frac{1}{x_n} < r$ when n > k

so that the sequence $(1/x_n)$ of functional values must converge to zero. ►


Example 2 For the revenue function in Example 11 of Section 11.2 we
have

    $\lim_{S \to \infty} f(S) = \lim_{S \to \infty} [12 - (4/S^2)]$

    $= 12 - \lim_{S \to \infty} (4/S^2)$

    $= 12 - 0 = 12$ ►
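Examples 1 and 2 can both be illustrated by evaluating the functions along one sequence that diverges to infinity. The sketch below uses the sequence 10, 100, 1000, ..., an arbitrary choice. The definition requires the same behavior along every diverging sequence, so this is a spot check rather than a proof.

```python
def f(x):
    """The function of Example 1."""
    return 1.0 / x

def revenue_tail(S):
    """Revenue function of Example 11 on its branch for S > 1."""
    return 12.0 - 4.0 / S ** 2

# One particular sequence diverging to infinity (chosen for illustration).
diverging = [10.0 ** n for n in range(1, 9)]

f_values = [f(x) for x in diverging]             # should approach 0
r_values = [revenue_tail(S) for S in diverging]  # should approach 12
```

The values in `f_values` fall toward 0 and those in `r_values` rise toward 12, matching the two limits.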


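If for every p-deleted sequence $(x_1, x_2, x_3, \ldots)$ the corresponding
sequence of functional values $(f(x_1), f(x_2), f(x_3), \ldots)$ diverges to infinity, we
write

    $\lim_{x \to p} f(x) = \infty$

Similarly, when the sequence of functional values diverges to minus infinity, we write

    $\lim_{x \to p} f(x) = -\infty$

The symbols $\infty$ and $-\infty$ are shorthand for the verbal definition and are not to be considered numbers.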


Example 3 A simple model of world population growth is

    $N(t) = N_0 e^{\theta (t - t_0)}$




$\lim_{x \to p} g(x) = g(p)$. Hence

    $\lim_{x \to p} (f/g)(x) = \lim_{x \to p} \frac{f(x)}{g(x)}$

    $= \frac{\lim_{x \to p} f(x)}{\lim_{x \to p} g(x)}$    [Theorem 2(iii)]

    $= \frac{f(p)}{g(p)}$ ►


Intuitively, if p is a point of continuity of a function f, then f cannot take
a jump at the point p. For if every sequence of functional values
$(f(x_1), f(x_2), f(x_3), \ldots)$ corresponding to a p-deleted sequence $(x_1, x_2, x_3, \ldots)$
must converge to f(p), then for values of x near p, f(x) must be near f(p).
For example, each of the functions in Example 3 is discontinuous at zero
where the graph takes a jump (see Figures 4 and 6). A good rule of thumb for
intuitively determining if a function is continuous is to see if its graph can be
traced without lifting pencil from paper. Such considerations indicate that
the following theorems must be true. The proofs will be omitted.


Theorem 5 Maximum and Minimum Value Theorem Suppose a function
f is continuous throughout the interval [a, b]. (See Section 6.3 for interval
notation.) Then f has a minimum value m and a maximum value M in
the interval. That is, there are numbers $\alpha$ and $\beta$ between a and b such that
$m = f(\alpha)$ and $M = f(\beta)$, and such that for all x in the interval, the condition

    $m \le f(x) \le M$

is satisfied (see Figure 10). ►

Theorem 6 Intermediate Value Theorem Suppose a function f is continuous
throughout the interval [a, b]. If N is any number between f(a) and


FIGURE 10 





11.4 CONTINUOUS FUNCTIONS


A function f is said to be continuous at a point p in its domain if the limit of
f(x) as x approaches p is the functional value f(p), that is, if

    $\lim_{x \to p} f(x) = f(p)$

If f is continuous at each point in a set S, it is said to be continuous on S.


Example 1 Our calculations in Example 3 of Section 11.2 show that

    $g(x) = 2x - 3$

is continuous at x = 4. Actually this function is continuous at every point.
For if p is any real number and $(x_1, x_2, x_3, \ldots)$ is a p-deleted sequence, then

    $\lim (g(x_n)) = \lim (2x_n - 3)$

    $= 2 \lim (x_n) - 3$

    $= 2p - 3$

The limit is equal to the functional value at p.


Example 2 In Example 10 of Section 11.2, we saw that for any polynomial
f and any point p, the limit statement

    $\lim_{x \to p} f(x) = f(p)$

holds. Thus polynomials are continuous at all points.

noTcZmuou 2 f “ 2 - 

t.nuous at every other point /() d s not “ ,st Ho ™ver, / is con- 

(b) The function h in Example 4 of Section 1 1 t 
because the limit hm „ h(x\ - 1 ^ 11 2 1S not continuous at zero 

"(0)= 2 “ aI v t 

responding for hmuTstaTed fr ° m 

Then the functions f + g, f - g, and f · g are also continuous at p. The
function f/g is continuous at p if $g(p) \neq 0$.

PROOF We prove only the last statement, leaving the other cases for the
problems. Since f and g are continuous at p, we have $\lim_{x \to p} f(x) = f(p)$ and




is, a price which results in the maximization of profits. To find this optimum
price, we complete the square in (5) to obtain

    $P = \frac{214{,}000}{3} - 30\left(p - \frac{230}{3}\right)^2$

Thus the maximum profit is $P_m = 214{,}000/3 = 71{,}333\tfrac{1}{3}$ at the optimum
price $p_0 = 230/3 = 76\tfrac{2}{3}$. ►
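The optimum price and maximum profit of Example 4 can be confirmed with exact rational arithmetic. The sketch below rebuilds the profit from the demand, cost, and revenue relations of the example. The function name `profit` and the choice to test neighboring prices one unit away are illustrative assumptions.

```python
from fractions import Fraction

def profit(p):
    """Net profit at per unit price p, built from the relations of Example 4."""
    sales = 4000 - 30 * p       # demand equation
    cost = 25000 + 20 * sales   # total cost, equation (3)
    revenue = p * sales         # total revenue, equation (4)
    return revenue - cost       # net profit, equation (5)

p_opt = Fraction(230, 3)        # optimum price found by completing the square
max_profit = profit(p_opt)

# Profit at neighboring prices, for comparison with the maximum.
below = profit(p_opt - 1)
above = profit(p_opt + 1)
```

Because the squared term in the completed square is multiplied by -30, moving the price one unit in either direction lowers the profit by exactly 30.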


PROBLEMS 


1. Find the following limits and determine in which cases the function is
continuous at the point at which the limit is taken.


(a) hm (x — 2x -+■ 4) 


(b) hm ( x 

x _*0 


■2x+ 4) 


(c) lim (x — 2x -f 4), where a is any real number 

z - 16 


(d) hm |/( 

f -.0 

(0 hm fix), where 
*—,—1 


fix) . 


(e) Jim - 


x* 5 4- 5* + 4 
x+1 


if X 9* ■ 
if X = ■ 


2. Each of the following functions fails to be continuous at the point p
specified. Why?

(a) 

-• ~ - when x ^ If 

at p — 0 

when x = 0 


/w 


■(f 


(b) 


g(0 


2 , 

t + t - 

.0 


(C) 


Afe) » 


.4 - 


when / ^ 2 
when t = 2 
when r < 1 
when r > I 


at p = 2 


at p =» 1 


3. For each of the following functions 0 is not in the domain. If possible,
assign a value f(0) so that f is continuous at 0. If this is not possible,
indicate why not.


/to - i 


(c) /(*) “ sin I 


(c) /(*) ■ 


(b)/(t) - ± 

«>/(*) - it si 
I 





f(b), there is at least one point p between a and b at which

    $f(p) = N$

(see Figure 10). ►
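Theorem 6 only asserts that such a point p exists. One common way to locate it numerically is bisection, sketched below. Bisection is not the book's method (the proofs are omitted in the text), and the function name, the tolerance, and the sample function $x^3 + x$ are assumptions made for illustration.

```python
def bisect_for_value(f, a, b, N, tol=1e-10):
    """Approximate a point p in [a, b] with f(p) = N.

    Assumes f is continuous on [a, b] and N lies between f(a) and f(b),
    which are the hypotheses of the Intermediate Value Theorem.
    """
    lo, hi = a, b
    g_lo = f(lo) - N
    g_hi = f(hi) - N
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("N is not between f(a) and f(b)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f(mid) - N
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid                  # the sign change lies in the left half
        else:
            lo, g_lo = mid, g_mid     # the sign change lies in the right half
    return (lo + hi) / 2

# Sample use: f(x) = x^3 + x on [0, 2] takes the values 0 and 10, so it must take 5.
p = bisect_for_value(lambda x: x ** 3 + x, 0.0, 2.0, 5.0)
```

Each pass halves an interval on which f - N changes sign, so continuity guarantees the interval always contains a point where f equals N.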


Example 4 The estimated sales volume S of a certain firm is related to
the per unit price p by the demand equation

    $S = 4000 - 30p$

The total cost C incurred in the production of S units is

    $C = 25{,}000 + 20S$

    $= 105{,}000 - 600p$    (3)

It follows that the total revenue is

    $R = pS = 4000p - 30p^2$    (4)

and that the net profit is

    $P = R - C = 4600p - 30p^2 - 105{,}000$    (5)

The cost and revenue functions (3) and (4) are graphed in Figure 11. As
the per unit price increases, total cost steadily decreases while total revenue
first increases, reaching a maximum of $133{,}333\tfrac{1}{3}$ at $p = 66\tfrac{2}{3}$, and then drops
off. The height of the region between the cost and revenue curves above a
particular per unit price p indicates the net profit corresponding to that price.
Since the profit function (5) is continuous, Theorem 5
guarantees the existence of an optimum price, that

FIGURE 11 [vertical axis: C or R]




uous at x = p, then

    $\frac{1}{f(x) - g(x)}$

is also continuous at p.

14. A function f is called additive if

    $f(x + y) = f(x) + f(y)$

for all real x and y.

(a) Give an example of an additive function. (Hint: Try a first degree
polynomial.)

(b) Give an example of a nonadditive function. (Hint: Try polynomials.)

(c) Prove that if f is additive, then f(0) = 0. (Hint: 0 = 0 + 0.)

(d) Prove that if an additive function is continuous at x = 0, then it is
continuous at all points on the real line. (Use Problem 11.)

15. A real valued function g is called multiplicative if

    $g(x + y) = g(x) g(y)$

for all real x and y.

(a) Give an example of a multiplicative function. (Hint: Recall the
rules for exponents.)

(b) Give an example of a nonmultiplicative function.

(c) Show that if g is multiplicative, then either g(0) = 1 or g(0) = 0.

(d) Show that if g(0) = 0, then g(x) = 0 for all x.

(e) Show that if g is continuous at x = 0, then g is continuous at all
points.

16. Give an example of a function F for which

(a) $F(xy) = F(x) F(y)$ for all x and y,

(b) $F(xy) = F(x) + F(y)$ for all x and y.

17. If F is a function whose domain includes the number zero and is such
that

    $F(xy) = F(x) + F(y)$,

prove that F is the constant function F(x) = 0.

18. (a) A function f is said to be an even function if whenever x and -x are
in its domain, $f(-x) = f(x)$. Give an example of an even function.
(b) A function f is said to be an odd function if whenever -x and x are
in its domain, $f(-x) = -f(x)$. Give an example of an odd function.

SUPPLEMENTARY READING

Anderson, K. W., and Hall, D. W., Sets, Sequences and Mappings (John Wiley &
Sons, Inc., New York, 1963), Chapter 5.

Good, R. A., Introduction to Mathematics (Harcourt, Brace and World, New York,
1966), Chapter 9.



1 

f(x) - g(x) 



4. Which of the following functions have points of discontinuity, that is,
points at which they are not continuous?

(a) fix) = * > 0 


(b) 


fix) - 


2 COS —Z if X J* 0 
X 2 


(c) 


six) = 


if X ■■ 


if x > 0 
if* < 0 


(d) 


hit) = 


(e) 


fix)- 


[/+ 1 
/ - 1 
lo 

: - 4 
- 2 


if t > 0 
if / < 0 
if t - 0 

x * 2 
x =■ 2 


5. Prove that any constant function is continuous.

6. Prove that if f and g are both continuous at the point p, then their sum
f + g is also continuous at p.

7. Prove that the difference f - g between two continuous functions f and
g is itself a continuous function.

8. Prove that the product of two continuous functions is a continuous
function.

9. Use induction to extend to any finite number of functions the statements
of Theorem 4 concerning sums and products.

10. Prove that if f is continuous on the interval from a to b, and if f(a) and
f(b) have opposite signs, there is a point c between a and b at which
f(c) = 0.

11. Prove that

(a) $\lim_{h \to 0} [f(a + h) - f(a)] = 0$

(b) $\lim_{h \to 0} f(a + h) = f(a)$

(c) $\lim_{x \to a} f(x) = f(a)$

are all equivalent to saying that f is continuous at the point a.

12. Prove that a quotient of two polynomials is continuous except where
the denominator is zero.

13. Suppose that f and g have the same domain and that f(x) > g(x)
for all x in the common domain. Show that if f(x) and g(x) are contin-




FIGURE 2 [road with points 0, $p_1 = g(t_1)$, and $p_2 = g(t_2) = g(t_1 + h)$]


measured from some arbitrary point called zero. If you pass point $p_1$ (Figure 2)
at time $t_1$, and point $p_2$ at time $t_2 = t_1 + h$, your average speed in
the time interval from $t_1$ to $t_2$ is

    $\frac{p_2 - p_1}{t_2 - t_1} = \frac{g(t_2) - g(t_1)}{t_2 - t_1} = \frac{g(t_1 + h) - g(t_1)}{h}$    (1)

This average speed should, as h grows smaller, become a better and better
estimate of the actual speed at $p_1$. In fact, it makes a great deal of sense to
use the limit

    $\lim_{h \to 0} \frac{g(t_1 + h) - g(t_1)}{h}$    (2)

as the definition of actual speed at $p_1$.

To take a specific case, suppose $g(t) = t^2$ for all $t \ge 0$. Then the average
speed in the time interval from $t_1$ to $t_1 + h$ is

    $\frac{g(t_1 + h) - g(t_1)}{h} = \frac{(t_1 + h)^2 - t_1^2}{h} = 2t_1 + h$

Hence the instantaneous speed at $t = t_1$ is

    $\lim_{h \to 0} (2t_1 + h) = 2t_1$    (3)

Table 1 lists average speeds in some time intervals beginning at $t_1 = 2$ and
shows how the average speed gets closer to $2t_1 = 4$ as the length of the time
interval shrinks.


TABLE 1. The convergence of average speed to instantaneous speed at
time $t_1 = 2$, tabulated for $g(t) = t^2$

Initial time   Final time       Length of   Position at time t_2   Change in position           Average speed
t_1 = 2        t_2 = t_1 + h    time h      g(t_2) = (2 + h)^2     g(t_2) - g(t_1) = 4h + h^2   4 + h
               = 2 + h

2              10               8           100                    96                           12
2              7                5           49                     45                           9
2              4                2           16                     12                           6
2              3                1           9                      5                            5
2              2.5              0.5         6.25                   2.25                         4.5
2              2.3              0.3         5.29                   1.29                         4.3
2              2.1              0.1         4.41                   0.41                         4.1
2              2.01             0.01        4.0401                 0.0401                       4.01
2              2.001            0.001       4.004001               0.004001                     4.001
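The rows of Table 1 can be regenerated with a few lines of code, which also makes it easy to try other starting times (as Problem 1 asks for $t_1 = 3$). This is a minimal sketch. The variable names are illustrative, and floating-point arithmetic means the printed values agree with the table only up to rounding.

```python
def g(t):
    """Position on the road at time t, as in the text: g(t) = t^2."""
    return t * t

t1 = 2.0
rows = []
for h in [8, 5, 2, 1, 0.5, 0.3, 0.1, 0.01, 0.001]:
    t2 = t1 + h                  # final time
    change = g(t2) - g(t1)       # change in position, equal to 4h + h^2
    rows.append((t1, t2, h, g(t2), change, change / h))

# Columns: initial time, final time, length h, position, change, average speed.
for row in rows:
    print("  ".join(f"{value:10.6f}" for value in row))
```

Replacing `t1 = 2.0` by `t1 = 3.0` produces average speeds that approach 6 instead of 4.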








12 DIFFERENTIAL CALCULUS


12.1 THE DERIVATIVE

When you drive your car, the speed at which you are traveling is indicated at
all times by the speedometer. But Patrolman Smith, sad to say, cannot see
your speedometer and must resort to indirect means of checking your speed.
His method is this: he stretches two cables across the road a fixed distance
apart and electronically measures the time required for you to cover
the distance between them. In this way he determines the average speed,
which equals "distance divided by time," and, assuming that your speed
between the cables is constant, he uses this figure as his best estimate of your
actual speed.

Consider this method. The cables are stretched across at points $p_1$
and $p_2$ a distance $p_2 - p_1$ apart (see Figure 1). If these points are widely
separated, your speed may vary considerably and even at
times exceed the speed limit although your average speed may be well within
allowable limits. Similar considerations apply, but to a lesser extent, as the
distance is reduced. Putting aside practical considerations involving the
difficulty of precise measurements of time and distance, the best estimate of
your "instantaneous" speed, the speed showing on your speedometer as you
pass point $p_1$, is obtained by placing the point $p_2$ as close as possible to $p_1$.

To translate our problem into mathematical language, let us suppose that
g is a function whose value g(t) at time t represents your position on the road,





PROBLEMS 

2. According to Equation (6), the slope of $f(x) = x^2$ is $2(-1) = -2$ when
$x = -1$. Draw a sketch like Figure 3 and make a table like Table 1,
using the same h values, to show how the slope of the line PQ converges
to -2 as $h \to 0$, where P is the point (-1, 1) on the curve.

If we compare the slope of the tangent line (Equations (4), (5), and (6))
with Patrolman Smith's problem (Equations (1), (2), and (3)), we find some
striking resemblances. In each case we have begun with a function whose
values represent the amount of a certain quantity (distance or height, as in
the examples above). We have in Equations (1) and (4) computed an average
rate of change of this quantity, and we have then, by taking limits in
Equations (2) and (5), arrived at an instantaneous rate of change.


Definition 1 The process of moving from a function f to an instantaneous
rate of change is called differentiation. The quantity

    $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$    (7)

is called the derivative of the function f at the point x in its domain.


Differentiation forms the basis of one of the two major branches of
calculus, the differential calculus. The derivative (7) is denoted by $f'(x)$ or by
$Df(x)$. A commonly used alternative notation is to write y = f(x), in
which case the derivative is denoted by

    $\frac{dy}{dx}$

This is called the Leibniz* notation for the derivative. Each of the three
notations has its own advantages in different contexts, and the three forms are
used interchangeably.


Example 1 Consider a commodity, such as gasoline, which may be produced
in any non-negative quantity x. Let C be a cost function whose value
C(x) is the total cost associated with output level x. Then the marginal cost
at level x is defined to be the derivative $C'(x)$.

To take a specific case, suppose $C(x) = ax - bx^2$, where a and b are
constants. Then the marginal cost at level x is

    $C'(x) = \lim_{h \to 0} \frac{C(x + h) - C(x)}{h}$

    $= \lim_{h \to 0} \frac{a(x + h) - b(x + h)^2 - (ax - bx^2)}{h}$

    $= \lim_{h \to 0} \frac{ax + ah - bx^2 - 2bxh - bh^2 - ax + bx^2}{h}$

    $= \lim_{h \to 0} (a - 2bx - bh) = a - 2bx$ ►
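The algebra in Example 1 shows that, before the limit is taken, the difference quotient of the cost function equals a - 2bx - bh exactly. The sketch below confirms this with exact rational arithmetic. The numerical values of a, b, and x are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def cost(x, a, b):
    """Cost function of Example 1: C(x) = a*x - b*x^2."""
    return a * x - b * x ** 2

def average_marginal_cost(x, h, a, b):
    """Difference quotient [C(x + h) - C(x)] / h."""
    return (cost(x + h, a, b) - cost(x, a, b)) / h

# Hypothetical constants and output level, chosen only for illustration.
a, b, x = Fraction(10), Fraction(1, 4), Fraction(6)

steps = [Fraction(1, 10 ** n) for n in range(1, 6)]
quotients = [average_marginal_cost(x, h, a, b) for h in steps]
marginal_cost = a - 2 * b * x   # the limiting value a - 2bx
```

Each quotient differs from the marginal cost by exactly b times h, so the quotients approach a - 2bx as h shrinks.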

*After Gottfried Wilhelm Leibniz (1646-1716). Calculus was invented in the concurrent
works of Leibniz and Isaac Newton.





PROBLEMS

1. According to Equation (3), the instantaneous speed at time $t_1 = 3$ is
2(3) = 6. Make a table like Table 1 showing how the average speed
over a time interval beginning at t = 3 and ending at t = 3 + h
converges to 6 as $h \to 0$.

Now let us look at another problem. In the sketch of the function f shown
in Figure 3, the slope of the line joining the points P, with coordinates
(a, f(a)), and Q, with coordinates (b, f(b)), is computed by dividing the
difference f(b) - f(a) in the heights of the curve at P and Q by the horizontal
distance b - a between these two points. Writing b - a = h, we have

    slope PQ $= \frac{f(b) - f(a)}{b - a} = \frac{f(a + h) - f(a)}{h}$    (4)


FIGURE 3



If we now think of point P as being fixed and of moving Q along the curve
toward P, it seems reasonable to expect the slope of PQ to get closer and
closer to the slope of the line tangent to the graph of f at point P. In fact,
we define the slope of the tangent line (or, equivalently, the slope of f
at P) to be the limit

    slope of f at P $= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$    (5)

For example, if $f(x) = x^2$ for all x, then the slope of f at x = a is

    $\lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h}$

    $= \lim_{h \to 0} (2a + h) = 2a$    (6)




Rashevsky assumes that the numbers of passives change over time according
to the equations

    $\frac{dx}{dt} = ax - by + c$

    $\frac{dy}{dt} = \beta y - \alpha x + \gamma$

The positive constants a and $\beta$ represent the influence of passives having the
same attitude, the constants b and $\alpha$ indicate the influence of passives holding
the opposing view, while c and $\gamma$ mirror the constant influence of the
actives in the respective groups. ►

Example 5 Homans* expounds a theory of group behavior based on
four quantities: I, the intensity of interaction among group members; F, the
level of friendliness among group members; A, the amount of activity by
group members; and E, the amount of activity imposed on group members
by external environmental forces. Each of these quantities represents an
average over the group members and each may change with time.
The verbal treatment of Homans was translated by Simon† into the
mathematical relations

    $I(t) = a_1 F(t) + a_2 A(t)$

    $F'(t) = b[I(t) - \beta F(t)]$

    $A'(t) = c_1[F(t) - \gamma A(t)] + c_2[E(t) - A(t)]$

where all constants are assumed positive. The first relation indicates that
interaction increases with the level of friendliness and/or amount of activity
carried on within a group. The second states that the rate at which the
amount of friendliness changes depends upon the disparity between the
present levels of interaction and friendliness, this rate being positive when
interaction is high and friendliness low, and negative in the reverse case. The
third relation indicates that activity will increase when the levels of
friendliness and external stimulation are high relative to the present level of activity
and will decrease in the opposite situation.

PROBLEMS

In Problems 3-8, translate the verbal statements into equivalent mathematical
equations.

3. The time-rate of increase of a population is proportional to the
population size N(t).

4. The rate of adoption of a technological innovation is proportional to the
number of people $N_a(t)$ who have adopted the innovation at time t
times the number of people $N_n(t)$ who have not adopted the innovation.

*Homans, G. C., The Human Group (Harcourt, Brace & World, Inc., New York, 1950).
†Simon, H. A., Models of Man (John Wiley & Sons, Inc., New York, 1957), Chapter 6.





Example 2 Other uses of the derivative in economics are the following.

(a) If f is a production function which relates the output level x of some
commodity with the amount z of input of labor (or raw material, or
capital), that is, if x = f(z), then the derivative $dx/dz = f'(z)$ is
called the marginal product.

(b) If p is a demand function whose value p(x) at output level x specifies
the price which will lead to the sale of x units of output, then $r(x) =
x \cdot p(x)$ is the revenue which the firm obtains from the sale of x units.
The derivative $r'(x)$ is called marginal revenue.

(c) If u is a utility function which attaches utility u(x) to output level x,
then $u'(x)$ is the marginal utility at the point x. ►


Example 3 Richardson* adapts models for the spread of epidemics to the
study of the diffusion of "war fever" among a population. Let p(t) denote
the proportion of the population favorable to war at time t so that 1 - p(t)
is the proportion not favorable at that time. In the interval from time t to
time t + h, the proportion of the population converted to a prowar attitude
is p(t + h) - p(t). The average rate of conversion is [p(t + h) - p(t)]/h,
while the derivative

    $p'(t) = \lim_{h \to 0} \frac{p(t + h) - p(t)}{h}$

represents the instantaneous rate of conversion.
In the simplest model, Richardson assumes that the instantaneous spread
of war fever is proportional to both the number of people having prowar
sentiments and to the number who are antiwar. That is,

    $p'(t) = k p(t)[1 - p(t)]$    (8)

where k is a positive constant. This implies that the rate of change $p'(t)$ is
small when p(t) is close to 0 or to 1 and is greatest when p(t) is near 1/2.
Equations such as (8) which involve a function and its derivative are called
differential equations. Methods of solving such equations are discussed in
Chapter 15. ►
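The claim that the conversion rate in equation (8) is small near p = 0 and p = 1 and greatest near p = 1/2 can be checked by tabulating the right-hand side over a grid of proportions. This sketch does not solve the differential equation. The constant k = 1 and the grid spacing are arbitrary illustrative choices.

```python
def conversion_rate(p, k=1.0):
    """Right-hand side of equation (8): k * p * (1 - p)."""
    return k * p * (1 - p)

# Proportions 0.00, 0.01, ..., 1.00 and the corresponding conversion rates.
grid = [i / 100 for i in range(101)]
rates = [conversion_rate(p) for p in grid]

# Proportion at which the tabulated rate is largest.
p_at_max = grid[rates.index(max(rates))]
```

The largest tabulated rate occurs at p = 0.5, and the rate vanishes at both ends of the grid.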


Example 4 Rashevsky's† model of mass behavior applies to a population
divided into two groups X and Y, for example, by allegiance to two political
parties. Within each group there are a fixed number of "actives," $x_0$ and $y_0$,
who are immune to attitude change. The remaining members of the groups
(call their numbers x and y) are "passives" and are subject to attitude change.

*Richardson, L. F., "War Moods," Psychometrika 13, 147-174 (1948).

†Rashevsky, N., "Studies in the Mathematical Theory of Human Relations," Psychometrika
4, 221-239 (1939).




(b) If L = 80, a = 20, and c = 5, what would be the value of f(x) at
x = 25 if the initial rate were maintained?

(c) How does this compare with the predicted learning performance
f(25)?

12.2 DERIVATIVES OF SUMS, PRODUCTS,
AND QUOTIENTS


The simplest results about derivatives are intuitively obvious from the
interpretation of the derivative as a slope. For instance, any constant function
has zero slope and the derivative reflects this. If f(x) = c for all x (see
Figure 4), then f(x + h) = f(x) = c, and we have

    $Df(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{c - c}{h} = \lim_{h \to 0} 0 = 0$


FIGURE 4 (a) A constant function has zero slope; (b) the identity
function has slope 1


Similarly, we know from geometrical considerations that the identity
function I(x) = x, whose graph is a straight line through the origin inclined 45°
from the horizontal, must have slope 1 (see Figure 4). Using Equation (7)
to obtain the derivative of I gives

    $DI(x) = \lim_{h \to 0} \frac{I(x + h) - I(x)}{h} = \lim_{h \to 0} \frac{(x + h) - x}{h} = 1$

for every real number x. These results are both special cases of the following
theorem.


Theorem 1 Let n be a non-negative integer and define the function g by
$g(x) = x^n$. Then $Dg(x) = nx^{n-1}$.


5. Marginal revenue per unit varies with the square of the production
level x.


6. The level of difficulty D(t) of learning at time t decreases at a rate
proportional to the product of the level of difficulty at time t and the total
amount p(t) of practice since the beginning of formal practice.

7. The rate at which an organization expends energy over time is
proportional to the difference between the progress p(t) of the organization at
time t and the anticipated progress p(t).

8. The time rate of change of the number N of excited neural circuits is
proportional to the average frequency of the impulses less a proportion
of the number of excited neural circuits.

In Problems 9-12, find the increment $f(x + h) - f(x)$ corresponding to
h = 0.02 and x = 4. Also find the average rate of change of f in the interval
from x = 4 to x = 4.02.


9. $f(x) = x + 2$          10. $f(x) = 2x^2 - 4x + 3$

11. $f(x) = x^3 - 2x^2 + 3x - 6$          12. $f(x) = 2/(x + 2)$

In Problems 13-17, use the definition (7) to find the derivative of the
indicated function.

13 /(*) - 2x - x* ,4. _ ix 3 _ ix 2 + x + ^2 

15. „(,)=! + 3, + 4,= 16. /i(x) = 4x(2 — x) 

17. i(i) = (/ - I)(, + 2 ) 

18 Find the slope of each of the following curves 

(a) y = x- - 2* at x =■ 3 

(b) ) - Jt-t + Sx + 4 at x = 1 

19. Suppose that total cost for a firm ,s given by the function 
15 6 + 4x + 3x 2 

he amount produced Show that the marginal cost function is 
MC(x) 4 4. r lx 

conditions K nrooorli™ . rcqu ' red 10 S1 °P an automobile under normal 
when “ 0UtK «*** <‘P“ d > 11 d “ 18 

u == 60 change of d when e — 30 and also when 


21. Thurstonc* has developed the learning curve 


fix) = 

m c ,s 

(a) Had ,hc slope of ,he line langen, ,o/at v - 0 
•Thurstonc, L L. ‘Th*Lc. r 

un / Ps>rhot > K> UohnNIuSt^T n-pnntcd m C3 A Mitler, Mathematics 
Ky * 5ons ‘ lnc • N «-w York. 19W), pp 128-132 




For (b), we have

    $D[f(x) + g(x)] = \lim_{h \to 0} \frac{[f(x + h) + g(x + h)] - [f(x) + g(x)]}{h}$

    $= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x + h) - g(x)}{h}$

    $= Df(x) + Dg(x)$

Finally, (c) follows by induction from (a) and (b). ►

PROBLEMS

2. Use mathematical induction to establish Part (c) of Theorem 2.

Theorems 1 and 2 together enable us to compute the derivative of any
polynomial, as in the following example.

Example 1 (a) If $f(x) = 7x^{12} - 29x^2 + 3x - 6$, then

    $Df(x) = D[7x^{12} - 29x^2 + 3x - 6]$

    $= D(7x^{12}) + D(-29x^2) + D(3x) + D(-6)$    [Theorem 2(c)]

    $= 7Dx^{12} - 29Dx^2 + 3Dx + D(-6)$    [Theorem 2(a)]

    $= 7 \cdot 12x^{11} - 29 \cdot 2x + 3 \cdot 1 + 0$    [Theorem 1]

    $= 84x^{11} - 58x + 3$

(b) The derivative of $g(t) = -7t^3 + 21t - 2$ is

    $g'(t) = D[-7t^3 + 21t - 2]$

    $= D(-7t^3) + D(21t) + D(-2)$

    $= -7Dt^3 + 21Dt + 0$

    $= -7(3t^2) + 21(1)$

    $= -21t^2 + 21$

When t = 4, the derivative is $g'(4) = -21(4^2) + 21 = -315$, while at
t = 1, $g'(1) = -21(1^2) + 21 = 0$. ►

Example 2 Ncifeld and Poffenbcrger* found, for / > 0, that the empirical 
equation „ „ 

3(0 = Oo ~F Q\t *F on t~ -F 03 / 

where Oo» Qi» <* 2 . and 03 are constants, described the relationship between 
amount y(r) of work output and elapsed time / The rate of change of work 
output at time / is therefore 

>'(/) = Oi *F 2 ant + 3 o 3 /- ► 

* NnfcJd M K and Poflinbiricr. A T, “A Mathematical Anal>sjs of Work Curves,’* 
Journal of General Psychology 1, 44S— 456 ( 192S) 


122 SUMS. PRODUCTS. AND QUOTIENTS / 303 



PROOF We have already proved the theorem for constant functions ($n = 0$) and the identity function ($n = 1$). For $n \ge 2$, we use the binomial expansion

$$(x + h)^n = x^n + nx^{n-1}h + \tfrac{1}{2}n(n - 1)x^{n-2}h^2 + \cdots + h^n$$

to obtain

$$Df(x) = Dx^n = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} = \lim_{h \to 0} \frac{[x^n + nx^{n-1}h + \tfrac{1}{2}n(n - 1)x^{n-2}h^2 + \cdots + h^n] - x^n}{h}$$

Canceling the terms involving $x^n$ and then dividing by $h$ gives

$$Dx^n = \lim_{h \to 0} \left(nx^{n-1} + \tfrac{1}{2}n(n - 1)x^{n-2}h + \cdots + h^{n-1}\right) = nx^{n-1}$$ ►

If $n$ is a positive integer, the derivative of $x^n$ is $nx^{n-1}$. For instance, the derivatives of $x^4$, $x^{27}$, and $x^2$ are, respectively, $4x^3$, $27x^{26}$, and $2x$.
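The limit in the proof can also be watched numerically. The following Python sketch is only an illustration (the helper names `difference_quotient` and `power_rule` are ours, not the text's): it evaluates the ratio $[(x + h)^n - x^n]/h$ for shrinking $h$ and compares it with $nx^{n-1}$.

```python
def difference_quotient(f, x, h):
    """Ratio [f(x + h) - f(x)] / h, whose limit as h -> 0 defines Df(x)."""
    return (f(x + h) - f(x)) / h


def power_rule(n, x):
    """Derivative of x**n predicted by Theorem 1: n * x**(n - 1)."""
    return n * x ** (n - 1)


if __name__ == "__main__":
    x = 1.5  # an arbitrary sample point (our choice)
    for n in (2, 4, 27):
        f = lambda t, n=n: t ** n
        for h in (1e-2, 1e-4, 1e-6):
            # the two printed values should approach each other as h shrinks
            print(n, h, difference_quotient(f, x, h), power_rule(n, x))
```

The agreement improves as $h$ decreases, until floating-point rounding takes over for very small $h$.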


PROBLEMS 

1 Find the derivatives of x 6 , x 1 and x 3 


Since differentiation involves a limiting operation, it is natural to expect 
that properties of derivatives should parallel those of limits in general 
Theorem 2 states the most important results 


Theorem 2 (a) If the derivative $Df(x)$ of the function $f$ exists at the point $x$ and if $c$ is any constant, then

$$D[cf(x)] = cDf(x)$$

The derivative of a constant times a function is the constant times the derivative of the function.

(b) If $Df(x)$ and $Dg(x)$ exist, then

$$D[(f + g)(x)] = Df(x) + Dg(x)$$

The derivative of the sum of two functions is the sum of the respective derivatives.

(c) The results in (a) and (b) may be extended by induction to

$$D[a_1 f_1(x) + a_2 f_2(x) + \cdots + a_n f_n(x)] = a_1 Df_1(x) + a_2 Df_2(x) + \cdots + a_n Df_n(x)$$

whenever the derivatives $Df_1(x), Df_2(x), \ldots, Df_n(x)$ exist and $a_1, a_2, \ldots, a_n$ are constants.

PROOF For (a), we have

$$D[cf(x)] = \lim_{h \to 0} \frac{cf(x + h) - cf(x)}{h} = \lim_{h \to 0} c\,\frac{f(x + h) - f(x)}{h} = c \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = cDf(x)$$

For (b), we have

$$D[f(x) + g(x)] = \lim_{h \to 0} \frac{f(x + h) + g(x + h) - f(x) - g(x)}{h} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = Df(x) + Dg(x)$$

Finally, (c) follows by induction from (a) and (b) ►

PROBLEMS 

2. Use mathematical induction to establish Part (c) of Theorem 2 

Theorems 1 and 2 together enable us to compute the derivative of any polynomial, as in the following example.

Example 1 (a) If $f(x) = 7x^{12} - 29x^2 + 3x - 6$, then

$$\begin{aligned}
Df(x) &= D[7x^{12} - 29x^2 + 3x - 6] \\
&= D(7x^{12}) + D(-29x^2) + D(3x) + D(-6) && \text{[Theorem 2(c)]} \\
&= 7Dx^{12} - 29Dx^2 + 3Dx + D(-6) && \text{[Theorem 2(a)]} \\
&= 7 \cdot 12x^{11} - 29 \cdot 2x + 3 \cdot 1 + 0 && \text{[Theorem 1]} \\
&= 84x^{11} - 58x + 3
\end{aligned}$$

(b) The derivative of $g(t) = -7t^3 + 21t - 2$ is

$$\begin{aligned}
g'(t) &= D[-7t^3 + 21t - 2] \\
&= D(-7t^3) + D(21t) + D(-2) \\
&= -7Dt^3 + 21Dt + 0 \\
&= -7(3t^2) + 21(1) \\
&= -21t^2 + 21
\end{aligned}$$

When $t = 4$, the derivative is $g'(4) = -21(4^2) + 21 = -315$, while at $t = 1$, $g'(1) = -21(1^2) + 21 = 0$ ►
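The term-by-term procedure of Example 1 is mechanical enough to sketch in a few lines of Python. The representation below (a list whose entry `k` is the coefficient of $x^k$) and the function names are our own assumptions, chosen only for illustration.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (a_k multiplies x**k).

    By Theorems 1 and 2 each term a_k * x**k becomes k * a_k * x**(k - 1),
    so the constant term drops out and the other coefficients shift down.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial with coefficient list coeffs at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Example 1(a): f(x) = 7x^12 - 29x^2 + 3x - 6
f_coeffs = [-6, 3, -29] + [0] * 9 + [7]
# Example 1(b): g(t) = -7t^3 + 21t - 2
g_coeffs = [-2, 21, 0, -7]

if __name__ == "__main__":
    print(poly_derivative(f_coeffs))   # coefficients of 84x^11 - 58x + 3
    dg = poly_derivative(g_coeffs)     # coefficients of -21t^2 + 21
    print(dg, poly_eval(dg, 4), poly_eval(dg, 1))
```

Evaluating the derivative list of $g$ at $t = 4$ and $t = 1$ reproduces the values $-315$ and $0$ found in part (b).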

Example 2 Neifeld and Poffenberger* found, for $t > 0$, that the empirical equation

$$y(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3$$

where $a_0$, $a_1$, $a_2$, and $a_3$ are constants, described the relationship between amount $y(t)$ of work output and elapsed time $t$. The rate of change of work output at time $t$ is therefore

$$y'(t) = a_1 + 2a_2 t + 3a_3 t^2$$ ►

*Neifeld, M. R., and Poffenberger, A. T., "A Mathematical Analysis of Work Curves," Journal of General Psychology 1, 448-456 (1928).


122 SUMS. PRODUCTS. AND QUOTIENTS / 303 



PROBLEMS

3. Calculate derivatives of

(a) $f(x) = 4x^2 - 2$

(b) g(0 - -3t G + 4/" 1 - 2f + 7 

(c) $h(u) = u^8 - u$

Although the derivative of a sum is the sum of the derivatives, it is not true that the derivative of a product is the product of the individual derivatives. For example, $x^3$ can be written as the product of $x^2$ and $x$, but the derivative $Dx^3 = 3x^2$ is not the product of $Dx^2 = 2x$ and $Dx = 1$.

Theorem 3 If the functions $f$ and $g$ both have derivatives at a point $x$, then the derivative of the product $f \cdot g$ at $x$ is

$$D[f(x) \cdot g(x)] = f(x)Dg(x) + g(x)Df(x)$$

Thus, the derivative of the product of two functions is a sum, each term of which is the product of one of the functions times the derivative of the other.

PROOF By definition

$$D[f(x)g(x)] = \lim_{h \to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h}$$

Rewriting the difference $f(x + h)g(x + h) - f(x)g(x)$ as

$$f(x + h)[g(x + h) - g(x)] + g(x)[f(x + h) - f(x)]$$

gives

$$D[f(x)g(x)] = \left[\lim_{h \to 0} f(x + h)\right]\left[\lim_{h \to 0} \frac{g(x + h) - g(x)}{h}\right] + g(x) \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

The second and third of the indicated limits are, of course, the respective derivatives $Dg(x)$ and $Df(x)$. As for the first, we have

$$\begin{aligned}
\lim_{h \to 0} f(x + h) &= \lim_{h \to 0} \left[f(x) + h \cdot \frac{f(x + h) - f(x)}{h}\right] \\
&= f(x) + \left[\lim_{h \to 0} h\right]\left[\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}\right] \\
&= f(x) + 0 \cdot Df(x) = f(x)
\end{aligned} \qquad (9)$$

This completes the proof ►

For future reference we note that Equation (9), together with Problem 11 of Section 11.4, constitutes a proof of the following useful result.

Theorem 4 If a function has a derivative at a point $a$ in its domain, then it must be continuous at $a$ ►


Example 3 (a) $Dx^3 = x^2 Dx + x Dx^2 = x^2 \cdot 1 + x \cdot 2x = 3x^2$

(b) Since $D(3x^2 - 4x) = 6x - 4$ and $D(-6x^5 - 2x^2 + 1) = -30x^4 - 4x$, we have

$$\begin{aligned}
D[(3x^2 - 4x)(-6x^5 - 2x^2 + 1)] &= (3x^2 - 4x)(-30x^4 - 4x) + (-6x^5 - 2x^2 + 1)(6x - 4) \\
&= -126x^6 + 144x^5 - 24x^3 + 24x^2 + 6x - 4
\end{aligned}$$ ►
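As a hedged cross-check of Example 3(b), the Python sketch below compares three things at a few sample points: the product-rule expression, the expanded polynomial, and a numerical slope of the product itself. The function names and the step size `h` are our own choices for illustration.

```python
def f(x):
    return 3 * x**2 - 4 * x


def g(x):
    return -6 * x**5 - 2 * x**2 + 1


def df(x):
    return 6 * x - 4


def dg(x):
    return -30 * x**4 - 4 * x


def product_rule(x):
    """Theorem 3: D[f g] = f Dg + g Df."""
    return f(x) * dg(x) + g(x) * df(x)


def expanded(x):
    """The expanded derivative stated at the end of Example 3(b)."""
    return -126 * x**6 + 144 * x**5 - 24 * x**3 + 24 * x**2 + 6 * x - 4


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of DF(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (-1, 0, 1, 2):
        numeric = central_difference(lambda t: f(t) * g(t), float(x))
        print(x, product_rule(x), expanded(x), numeric)
```

With integer inputs the first two columns agree exactly; the third is only approximate because it comes from a finite step.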

Theorem 5 If the functions $f$ and $g$ both have derivatives at a point $x$ and if $g(x) \ne 0$, then the derivative of the quotient $f/g$ at $x$ is

$$D\left[\frac{f(x)}{g(x)}\right] = \frac{g(x)Df(x) - f(x)Dg(x)}{[g(x)]^2} \qquad (10)$$

Thus, the derivative of the quotient of two functions is found by multiplying the denominator by the derivative of the numerator, subtracting the product of the numerator and the derivative of the denominator, and then dividing by the square of the denominator.

PROOF By definition

$$D\left[\frac{f(x)}{g(x)}\right] = \lim_{h \to 0} \frac{[f(x + h)/g(x + h)] - [f(x)/g(x)]}{h}$$

The numerator $[f(x + h)/g(x + h)] - [f(x)/g(x)]$ may be rewritten as

$$\frac{f(x + h)g(x) - f(x)g(x + h)}{g(x)g(x + h)} = \frac{g(x)[f(x + h) - f(x)] - f(x)[g(x + h) - g(x)]}{g(x)g(x + h)}$$

Dividing by $h$ and taking limits as $h \to 0$ yields the result (10). (Remember, $g$ is continuous so $g(x + h) \to g(x)$.) ►


Example 4

$$\begin{aligned}
D\left[\frac{3x^2 - 2x + 1}{6x^4 - 2}\right] &= \frac{(6x^4 - 2)D(3x^2 - 2x + 1) - (3x^2 - 2x + 1)D(6x^4 - 2)}{(6x^4 - 2)^2} \\
&= \frac{(6x^4 - 2)(6x - 2) - (3x^2 - 2x + 1)(24x^3)}{(6x^4 - 2)^2} \\
&= \frac{-36x^5 + 36x^4 - 24x^3 - 12x + 4}{(6x^4 - 2)^2}
\end{aligned}$$ ►
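The simplified numerator in Example 4 is easy to get wrong by hand, so a numerical spot check can be reassuring. The sketch below (function names and step size are our assumptions) compares the closed form with a symmetric difference quotient at two points where the denominator is nonzero.

```python
def q(x):
    """The quotient of Example 4."""
    return (3 * x**2 - 2 * x + 1) / (6 * x**4 - 2)


def q_prime(x):
    """The derivative obtained from Theorem 5 in Example 4."""
    return (-36 * x**5 + 36 * x**4 - 24 * x**3 - 12 * x + 4) / (6 * x**4 - 2) ** 2


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of DF(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (1.0, 2.0):
        print(x, q_prime(x), central_difference(q, x))
```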


Using Theorem 5 we find that Theorem 1 is also valid when $n$ is a negative integer. For if $n$ is negative, then $-n$ is positive and

$$\begin{aligned}
D(x^n) = D\left(\frac{1}{x^{-n}}\right) &= \frac{x^{-n}D(1) - 1 \cdot D(x^{-n})}{(x^{-n})^2} \\
&= \frac{x^{-n}(0) - (-n)x^{-n-1}}{x^{-2n}} && \text{[since } -n \text{ is positive]} \\
&= nx^{n-1}
\end{aligned}$$


Example 5 (a) $D(1/x^7) = D(x^{-7}) = -7x^{-8}$

(b) $D[6x^4 - 2x^2 + 3 - 2x^{-2} + x^{-3}] = 24x^3 - 4x + 4x^{-3} - 3x^{-4}$


Example 6 Carzo and Yanouzas* postulate that for a person whose predetermined goal level is $g > 0$ and whose perceived justice per unit of reward is $a > 0$, satisfaction $S$ is related to total reward $R$ by

$$S(R) = \frac{aR}{g - R}$$

The rate of change of satisfaction relative to reward is

$$S'(R) = \frac{(g - R)a - aR(-1)}{(g - R)^2} = \frac{ag}{(g - R)^2}$$

Thus, the rate of change of satisfaction is inversely proportional to the square of the difference between the personal goal of the individual and the amount of reward received ►
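A small Python sketch can make the "inversely proportional to the square" statement concrete. The parameter values `a = 2` and `g = 10` below are purely hypothetical, chosen by us for illustration; they do not come from Carzo and Yanouzas.

```python
def satisfaction(R, a, g):
    """S(R) = aR / (g - R), the model of Example 6."""
    return a * R / (g - R)


def satisfaction_rate(R, a, g):
    """S'(R) = ag / (g - R)**2, from the quotient rule."""
    return a * g / (g - R) ** 2


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of DF(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


if __name__ == "__main__":
    a, g = 2.0, 10.0  # hypothetical values, not from the source
    for R in (0.0, 4.0, 8.0):
        rate = satisfaction_rate(R, a, g)
        # rate * (g - R)**2 stays equal to a * g for every R
        print(R, rate, rate * (g - R) ** 2,
              central_difference(lambda r: satisfaction(r, a, g), R))
```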


Example 7 In a model of intercity migration, Galle and Taeuber† postulate that the number $Y$ of migrants from city A to city B during a fixed time interval is given by $Y(C) = K/C^a$ where $C$ is the number of migrants competing for opportunities in city B and $K$ and $a$ are constants. If for illustrative purposes we assume $a = 10$, then the rate of change of $Y$ relative to $C$ is $DY(C) = -10K/C^{11}$ ►

PROBLEMS 

In Problems 4-15, find the derivative of the given function 
4. s(l) - 3I 1 + 2 1 3 - 6/ 2 + 10 5. w(z) =. 4z + Jz 4 + V2 

6. T(x) = (x - l)(x 2 + 2) 7. „( r ) = (r + 3)(r 2 + 2) 

*Carzo, R., and Yanouzas, J. N., Formal Organization: A Systems Approach (Irwin, Homewood, Ill., 1967), p. 489.

†Galle, O. R., and Taeuber, K. E., "Metropolitan Migration and Intervening Opportunities," American Sociological Review 31, 5-13 (1966).



8. s( t) = 2f (l + 1) 
x - 1 
x + 1 

(, _ ])(,= + ,) 


10. >'(.v) = 
12. >(r) = 


14. q(p ) ■■ 


fi+ 1 

3 i 
P ~ 1 


9. >p(z) = z’(z + 2)(z - ]) 
3x 


11. z(x) = 


13 /(x) = 


x + 2 
x 2 4- 10 


x-f 1 


15. «(r) = 4r + ^ + 


r 2 (r + 2)3 


Find the value of the derivative at the point indicated for each of the following functions.

16. $y = (2x + 1)(x^2 + 4)$ at $x = 2$

17. = at x = I 

2 ~ x 

18. $u(t) = \dfrac{t^2}{t^2 + a^2}$ at $t = a$

19. $y(x) = (x^3 + 1)(x + 2)$ at $x = 1$

20. For each of the following functions, find the value of $x$ for which $f'(x) = 0$.

(a) /(x) - 2x 2 - 3x + 1 (b) fix) = jj— 

21. Prove that

$$\frac{D[f(x)g(x)]}{f(x)g(x)} = \frac{Df(x)}{f(x)} + \frac{Dg(x)}{g(x)}$$


12.3 MEAN VALUE THEOREM

For both theory and application, the most important result concerning derivatives is the Mean Value Theorem. The simplest version of this theorem is called Rolle's Theorem.

Rolle's Theorem Suppose the function $f$ has a derivative at each point in an interval $[a, b]$* and that $f(a) = f(b) = 0$ (see Figure 5). Then there is at least one point $c$ between $a$ and $b$ at which the derivative of $f$ is zero.

*Recall (Section 6.3) that if $a < b$, the notation $[a, b]$ indicates the set $\{x : a \le x \le b\}$ of all points on the number line lying between $a$ and $b$, including $a$ and $b$ themselves.

FIGURE 5

PROOF Geometrically the statement is obvious. What goes up and comes down must level off at some point. A formal proof is not difficult. There are three cases to consider.

(i) The values of $f$ are all zero. In this case $f'(x) = 0$ for all $x$ between $a$ and $b$ and we arbitrarily take $c = (a + b)/2$.

(ii) There are points at which $f$ is positive. By Theorem 5 of Chapter 11, there is a point $c$ between $a$ and $b$ at which $f$ takes its largest value. That is, $f(x) \le f(c)$ for all $x$ in the interval. Now $f'(c)$ is defined by

$$f'(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$

Since $f(c + h) \le f(c)$, these ratios are nonpositive when $h > 0$ and nonnegative when $h < 0$. At the same time, the fact that the limit exists means that for small values of $h$, either positive or negative, the ratios must be close to that limit. Zero is the only value that both nonpositive and nonnegative ratios can approach, so $f'(c) = 0$.

(iii) The function $f$ is never positive but there are points at which it is negative. In this case there is a point $c$ at which $f$ takes its smallest value. An argument similar to that in (ii) shows that here also $f'(c) = 0$ ►

The assumption that $f$ possesses a derivative at each point in the interval is critical. In Figure 6, for instance, the function $f$ takes the value zero at $a$ and at $b$, but there is no point between $a$ and $b$ at which the derivative is zero. This is because the function has no derivative at the point $c$ where $f$ takes its maximum value.

FIGURE 6


Example 1 The function $f(x) = x^2 + x - 2$ is zero at $x = 1$ and at $x = -2$ (see Figure 7). Its derivative

$$f'(x) = 2x + 1$$

is zero at $x = -\tfrac{1}{2}$, a point between $-2$ and $1$ ►

FIGURE 7
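Rolle's Theorem only asserts that a point $c$ exists. For Example 1 it can also be located numerically, as in the following Python sketch. The bisection helper `bisect_root` is our own illustrative code and assumes the function passed to it changes sign on the interval.

```python
def f(x):
    return x**2 + x - 2


def f_prime(x):
    return 2 * x + 1


def bisect_root(func, lo, hi, tol=1e-12):
    """Locate a zero of func in [lo, hi], assuming func changes sign there."""
    flo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = func(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid   # sign unchanged: zero lies to the right
        else:
            hi = mid              # sign changed (or zero hit): zero lies to the left
    return (lo + hi) / 2


if __name__ == "__main__":
    # hypotheses of Rolle's Theorem: f vanishes at both endpoints
    print(f(-2), f(1))
    # the point c promised by the theorem, where f'(c) = 0
    print(bisect_root(f_prime, -2.0, 1.0))
```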



PROBLEMS

1. The function $f(x) = x^3 - 4x$ is zero at $x = 0$, $x = 2$, and $x = -2$. At what point(s) is its derivative zero?

Mean Value Theorem Suppose the real-valued function $f$ has a derivative at every point in the interval $[a, b]$. Then there is at least one point $c$ between $a$ and $b$ at which

$$f'(c) = \frac{f(b) - f(a)}{b - a} \qquad (11)$$

PROOF A geometric interpretation is shown in Figure 8. The ratio

$$\frac{f(b) - f(a)}{b - a}$$

is the slope of the line between the points $(a, f(a))$ and $(b, f(b))$ on the graph of $f$. The slope of $f$ at a point $c$ is $f'(c)$ and the theorem simply asserts that there is at least one point $c$ between $a$ and $b$ at which the slope of the curve is the same as the slope of the line.

FIGURE 8

The proof is an easy consequence of Rolle's Theorem. The function $f$ need not satisfy the assumptions of that theorem (since $f(b)$ and $f(a)$ may not be zero), but the function

$$g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$$

obtained by subtracting points on the line from corresponding points of $f$, does. Specifically,

$$g(a) = f(a) - f(a) - \frac{f(b) - f(a)}{b - a}(a - a) = 0$$

$$g(b) = f(b) - f(a) - \frac{f(b) - f(a)}{b - a}(b - a) = 0$$

and the function $g$ has a derivative if $f$ does. In fact,

$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} \qquad (12)$$

Applying Rolle's Theorem to $g$ gives a point $c$ between $a$ and $b$ at which $g'(c) = 0$, and by Equation (12) this is exactly the statement (11). This completes the proof ►
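The construction in the proof suggests a simple numerical recipe: form $g'(x) = f'(x) - [f(b) - f(a)]/(b - a)$ as in Equation (12) and search for a zero. The Python sketch below does this by bisection for the illustrative function $f(x) = x^2$ on $[1, 3]$, which is our own choice. It assumes that $g'$ has opposite signs at the two endpoints, something the theorem itself does not guarantee.

```python
def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find c in [a, b] with f'(c) equal to the slope of the chord.

    Assumes g'(x) = f'(x) - slope (Equation (12)) changes sign on [a, b];
    this is an extra assumption made so that bisection applies.
    """
    slope = (f(b) - f(a)) / (b - a)
    g_prime = lambda x: f_prime(x) - slope
    lo, hi = a, b
    glo = g_prime(lo)
    if glo * g_prime(hi) > 0:
        raise ValueError("g' does not change sign; bisection is not applicable")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        gmid = g_prime(mid)
        if gmid == 0:
            return mid
        if (glo < 0) == (gmid < 0):
            lo, glo = mid, gmid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # chord slope is (9 - 1) / (3 - 1) = 4, and f'(c) = 2c
    print(mean_value_point(lambda x: x**2, lambda x: 2 * x, 1.0, 3.0))
```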

Example 2 If $r(x)$ is the revenue obtained by a firm from the sale of $x$ units of output, the average revenue is $\bar{r}(x) = r(x)/x$. Assuming that zero revenue is obtained from zero output (that is, $r(0) = 0$), this may be rewritten as $\bar{r}(x) = [r(x) - r(0)]/(x - 0)$. According to the Mean Value Theorem, there is a production level $x^* < x$ at which the marginal revenue $r'(x^*)$ is equal to the average revenue $\bar{r}(x)$ ►

PROBLEMS

2. Draw a graph like Figure 8 to illustrate Example 2.

3. For the function $f(x) = x^3$, find the point $c$ between $a = 0$ and $b = 1$ at which $f'(c) = [f(1) - f(0)]/(1 - 0)$.

A function $f$ is said to be increasing in an interval if $f(x) < f(y)$ whenever $x$ and $y$ are points in the interval and $x < y$. Similarly, $f$ is decreasing in an interval if, whenever $x$ and $y$ are in the interval and $x < y$, then $f(x) > f(y)$ (see Figure 9). The Mean Value Theorem enables us to determine where a function is increasing or decreasing by looking at its derivative. Specifically, if the derivative is positive throughout an interval, the function must be increasing in that interval, while if the derivative is negative, the function must be decreasing.

FIGURE 9 (a) An increasing function on $[a, b]$; (b) a decreasing function on $[a, b]$.

To see this, take $a < z < y < b$ and suppose that $f'(x) > 0$ for all $x$ between $a$ and $b$ (Figure 10). From the Mean Value Theorem we know that there is a point $c$ between $z$ and $y$ for which

$$\frac{f(y) - f(z)}{y - z} = f'(c)$$

or, equivalently,

$$f(y) - f(z) = f'(c)(y - z)$$

Since $f'(c) > 0$ and $y > z$, it follows that $f(y)$ exceeds $f(z)$. But $z$ and $y$ were any two points between $a$ and $b$, so we conclude that $f$ is increasing throughout the interval.

FIGURE 10 (number line showing the points $a$, $z$, $c$, $y$, $b$ in that order)

A similar argument may be used to establish the fact that a function decreases in any interval in which its derivative is negative. Together, these results provide a simple method for sketching the graph of a function.


Example 3 The derivative of

$$f(x) = x^3 - 3x^2 + 4$$

is

$$f'(x) = 3x^2 - 6x = 3x(x - 2)$$

When $x < 0$, both $3x$ and $x - 2$ are negative so that $f'(x)$, being the product of two negative numbers, is positive. For $x$ between 0 and 2, $3x$ is positive but $x - 2$ is negative, so $f'(x)$ is negative. Finally, for $x > 2$, $f'(x)$ is positive (see Figure 11). It follows that $f(x)$ is increasing when $x < 0$ and when $x > 2$, but decreasing when $0 < x < 2$. At $x = 0$ and $x = 2$, the slope of $f$ is zero.

FIGURE 11 (sign of $f'$: $+$ for $x < 0$, $-$ for $0 < x < 2$, $+$ for $x > 2$)

Putting this information together with the fact that $f(0) = 0^3 - 3(0)^2 + 4 = 4$ and $f(2) = 2^3 - 3(2^2) + 4 = 0$ yields Figure 12 as the graph of the function. Note carefully how positive, negative, and zero values of $f'$ are reflected in the graph ►

FIGURE 12
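The sign analysis of Example 3 can be tabulated with a few lines of Python. This is only a sketch: the helper name `behavior` and the sample points are our own, and sampling a derivative at isolated points illustrates the argument rather than proving it.

```python
def f(x):
    return x**3 - 3 * x**2 + 4


def f_prime(x):
    return 3 * x * (x - 2)


def behavior(x):
    """Classify f at x by the sign of its derivative."""
    d = f_prime(x)
    if d > 0:
        return "increasing"
    if d < 0:
        return "decreasing"
    return "stationary"


if __name__ == "__main__":
    for x in (-1, 0, 1, 2, 3):
        print(x, f(x), f_prime(x), behavior(x))
```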



Example 4 Suppose that a political candidate assumes that the number of votes he will receive is related to the amount of money that he spends on his campaign. He postulates that when $x$ thousands of dollars are expended, the plurality $P(x)$ of votes that he will receive is given for $x > 0$ by

$$P(x) = \tfrac{1}{3}x^3 - 4x^2 + 12x$$

The derivative

$$P'(x) = x^2 - 8x + 12 = (x - 2)(x - 6)$$

is positive when $0 < x < 2$ and when $x > 6$, and negative when $2 < x < 6$. A graph of $P$, increasing for $x < 2$ and $x > 6$, and decreasing between 2 and 6, is shown in Figure 13 ►

FIGURE 13



Example 5* Let $C(x)$ denote the cost of producing $x$ units of some commodity, so that $\bar{C}(x) = C(x)/x$ is the average cost per unit. Writing

$$C(x) = x\bar{C}(x)$$

and differentiating gives

$$C'(x) = x\bar{C}'(x) + \bar{C}(x) \quad \text{or} \quad x\bar{C}'(x) = C'(x) - \bar{C}(x)$$

Recalling (Example 1 of Section 12.1) that $C'(x)$ is the marginal cost at production level $x$, we see that in order for average cost to rise (that is, for $\bar{C}'(x) > 0$), the marginal cost $C'(x)$ must exceed the average cost $\bar{C}(x)$. Similarly, if marginal and average costs are equal, then average cost remains constant, while marginal cost must be less than average cost in order for average cost to decline ►
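To see the relation $x\bar{C}'(x) = C'(x) - \bar{C}(x)$ at work, the Python sketch below uses a hypothetical cost function $C(x) = 100 + 4x + 3x^2$. This function is our own illustrative assumption and does not appear in the text or in Baumol.

```python
def cost(x):
    """Hypothetical total cost C(x) = 100 + 4x + 3x^2 (an assumed example)."""
    return 100 + 4 * x + 3 * x**2


def marginal_cost(x):
    """C'(x) for the hypothetical cost function."""
    return 4 + 6 * x


def average_cost(x):
    """Average cost per unit, C(x) / x."""
    return cost(x) / x


def average_cost_rate(x):
    """Rate of change of average cost, from x * Cbar'(x) = C'(x) - Cbar(x)."""
    return (marginal_cost(x) - average_cost(x)) / x


if __name__ == "__main__":
    for x in (2.0, 10.0):
        trend = "rising" if average_cost_rate(x) > 0 else "falling"
        print(x, marginal_cost(x), average_cost(x), trend)
```

At the lower output level marginal cost is below average cost and average cost falls; at the higher level the order is reversed and average cost rises, as the example predicts.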

PROBLEMS 

4. For each of the following functions, find where the graph increases, 
where it decreases, and draw a sketch 

(a) /(x) = 3x 3 - 2x~ 

(b) g(i) = & + It + 4 

(c) /*(=> = 2 z 3 - 3z- + 6r - 6 

*Adapted from Baumol, W. J., Economic Theory and Operations Analysis (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1965), 2nd ed.

5. Use the Mean Value Theorem to prove that a function must be decreasing in any interval in which its derivative is negative.

6. In what intervals are the following functions increasing? decreasing? Sketch the graph of each function.

(a) $f(x) = 2x - 3$

(b) $g(t) = t^2 - 2t + 6$ (Hint: Complete the square.)

(c) $h(z) = -3$ if $z < -4$, and $h(z) = -2z + 6$ if $z \ge -4$

(d) $f(y) = [y]$, where $[y]$ denotes the largest integer less than or equal to $y$ (this is referred to as the bracket function)

7. Check that the hypotheses of Rolle's Theorem are satisfied by each of the following functions on the designated interval and find a point $c$ such that $f'(c) = 0$.

(a) $f(x) = x^3 - x$ on $[-1, 1]$    (b) $g(x) = 4x - 2x^2$ on $[0, 2]$

8. Show by example that the point $c$ in Rolle's Theorem need not be unique.

9. Prove that if the function $f$ has a derivative at each point in the interval $[a, b]$ and if $f(a) = f(b)$, then there exists a point $c \in [a, b]$ such that $f'(c) = 0$.

10. Check that the hypotheses of Problem 9 are satisfied and find a point $c$ in the specified interval such that $f'(c) = 0$.

(a) $f(x) = x^2$ on $[-1, 1]$    (b) $f(x) = x^3 - x$ on $[-1, 1]$

(c) $f(x) = |2x - 1|$ on $[-1, 2]$

11. Prove that if $f'(x) = 0$ for all $x \in [a, b]$, then $f$ is a constant function over $[a, b]$. (Hint: Apply the Mean Value Theorem.)

12. Prove that if $f$ and $g$ are functions such that $f'(x) = g'(x)$ for every $x \in [a, b]$, then there is a constant $c$ such that $f(x) = g(x) + c$. (Hint: Use Problem 11.)


12.4 COMPOSITE FUNCTIONS AND THE CHAIN RULE

In our previous discussion (Chapter 4) we have visualized a function, such as $h(x) = (2x - 3)^2$, as a system or machine which accepts an input $x$ and produces a corresponding output $h(x) = (2x - 3)^2$ as in Figure 14. For some purposes, however, it is convenient to think of $h$ as a two-stage system in which the input $x$ is first transformed, by means of a function $f$, into the quantity $f(x) = 2x - 3$. This quantity becomes the input of a function $g$ and is then squared (see Figure 15). Symbolically, if $f(x) = 2x - 3$ and $g(y) = y^2$ then

$$h(x) = g[f(x)] = g(2x - 3) = (2x - 3)^2$$

FIGURE 14 (the function $h$ as a single system: input $x$, output $(2x - 3)^2$)

FIGURE 15 (the function $h$ as a two-stage system with input $x$)

In general, if $f$ has domain $A$ and range $B$, and $g$ has a domain which contains $B$, the function $h$ defined for each $a \in A$ by

$$h(a) = g[f(a)]$$

has domain $A$ and range $C$ contained in the range of $g$ (see Figure 16). We shall call $h$ the composite mapping of $g$ with $f$ and write $h = g \circ f$. The function $h$ is sometimes referred to as the "composition of $g$ with $f$." The order of composition is important. For instance, in our first example where $f(x) = 2x - 3$ and $g(y) = y^2$, we have

$$(g \circ f)(x) = g[f(x)] = g(2x - 3) = (2x - 3)^2 = 4x^2 - 12x + 9$$

while

$$(f \circ g)(x) = f[g(x)] = f(x^2) = 2x^2 - 3$$

FIGURE 16
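The two-stage picture translates directly into code. In the Python sketch below, `compose` is a helper name of our own; it builds $g \circ f$ from $g$ and $f$, and the loop shows that reversing the order gives a different function.

```python
def compose(g, f):
    """Return the composite function g o f, which sends x to g(f(x))."""
    return lambda x: g(f(x))


f = lambda x: 2 * x - 3   # first stage in the text's example
g = lambda y: y ** 2      # second stage in the text's example

g_of_f = compose(g, f)    # (g o f)(x) = (2x - 3)^2
f_of_g = compose(f, g)    # (f o g)(x) = 2x^2 - 3

if __name__ == "__main__":
    for x in range(-2, 4):
        print(x, g_of_f(x), f_of_g(x))
```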

Example 1 Suppose that total cost is given by the function

$$f(x) = 4x^2 + 4x + 1 = (2x + 1)^2$$

where $x$ is the amount produced. If we let

$$h(x) = 2x + 1 \quad \text{and} \quad g(y) = y^2$$

then

$$f(x) = (g \circ h)(x)$$ ►


The theorem which tells how to differentiate a composite function is called the Chain Rule.

Chain Rule Suppose the real-valued functions $f$ and $g$ have continuous derivatives throughout their domains. Then the derivative of the composite function $h(x) = (g \circ f)(x)$ at the point $a$ is given by the product

$$h'(a) = g'[f(a)] \cdot f'(a) \qquad (13)$$

In words, the derivative of $h$ at the point $a$ is the product of the derivative of $g$ at the point $f(a)$ times the derivative of $f$ at the point $a$.

PROOF By definition

$$h'(a) = \lim_{t \to 0} \frac{h(a + t) - h(a)}{t} = \lim_{t \to 0} \frac{g[f(a + t)] - g[f(a)]}{t} \qquad (14)$$

According to the Mean Value Theorem, there exists a point $u$ between $f(a)$ and $f(a + t)$ at which

$$g[f(a + t)] - g[f(a)] = g'(u)[f(a + t) - f(a)] \qquad (15)$$

Substituting (15) into (14) we obtain

$$h'(a) = \left[\lim_{t \to 0} g'(u)\right]\left[\lim_{t \to 0} \frac{f(a + t) - f(a)}{t}\right] \qquad (16)$$

The second of these limits is, by definition, $f'(a)$. To evaluate $\lim_{t \to 0} g'(u)$, we note first that $\lim_{t \to 0} f(a + t) = f(a)$ since $f$ is continuous (see Theorem 4). Since $u$ lies between $f(a)$ and $f(a + t)$, as in Figure 17, it follows that $\lim_{t \to 0} u = f(a)$, also. Hence, using the assumed continuity of the derivative $g'$, we obtain

$$\lim_{t \to 0} g'(u) = \lim_{u \to f(a)} g'(u) = g'(f(a)) \qquad (17)$$

Inserting (17) into (16) yields the desired result (13) ►

FIGURE 17 (number line showing $u$ between $f(a)$ and $f(a + t)$)

It is important to remember that the three derivatives in (13) are not all evaluated at the same point. The derivatives of $h$ and of $f$ are evaluated at $a$, while the derivative of $g$ is evaluated at $f(a)$.

The Leibniz notation (see Section 12.1) provides an easily remembered format for the Chain Rule. Let $y = f(x)$ and $z = g(y) = g(f(x))$. Then the Chain Rule (13) takes the form

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} \qquad (18)$$

The formula works as though the derivatives were ratios of quantities $dx$, $dy$, and $dz$ which could be treated just as ordinary numbers.


Example 2 (a) To differentiate $(x^4 - 3x + 2)^{14}$, let $f(x) = x^4 - 3x + 2$ and $g(y) = y^{14}$. Then $g'(y) = 14y^{13}$ and $f'(x) = 4x^3 - 3$. Taking $y = f(x)$ we have

$$D(x^4 - 3x + 2)^{14} = 14(x^4 - 3x + 2)^{13}(4x^3 - 3)$$

(b) To differentiate $[x^2 + (1/x^3)]^{-3}$, let $y = f(x) = x^2 + (1/x^3)$ and $z = g(y) = y^{-3} = [x^2 + (1/x^3)]^{-3}$. Then

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = -3y^{-4}(2x - 3x^{-4}) = -3\left(x^2 + \frac{1}{x^3}\right)^{-4}\left(2x - \frac{3}{x^4}\right)$$

Both parts of Example 2 are special cases of the following important result.

Theorem 6 Suppose $n$ is an integer and let $h(x) = [f(x)]^n$, where $f$ is a differentiable function. Then $h$ has a derivative given by

$$h'(x) = n[f(x)]^{n-1} f'(x)$$

PROOF In the Chain Rule (13), let $g(y) = y^n$. Then $g'(y) = ny^{n-1}$ and we have

$$h'(x) = g'(f(x))f'(x) = n[f(x)]^{n-1} f'(x)$$

as asserted ►

Example 3 (a) The derivative of

$$h(x) = (3x^4 - x^2 + 1)^{-2} \quad \text{is} \quad -2(3x^4 - x^2 + 1)^{-3}(12x^3 - 2x)$$

(b) The derivative of $[(x^2 - 1)/(2x + 6)]^7$ is

$$7\left[\frac{x^2 - 1}{2x + 6}\right]^6 \frac{(2x + 6)2x - (x^2 - 1)2}{(2x + 6)^2} = \frac{7(x^2 - 1)^6(2x^2 + 12x + 2)}{(2x + 6)^8}$$

(c) The derivative of $(3x - 4)^2(-x^3 + 3x + 2)^4$ is

$$\begin{aligned}
&(3x - 4)^2[4(-x^3 + 3x + 2)^3(-3x^2 + 3)] + (-x^3 + 3x + 2)^4[2(3x - 4) \cdot 3] \\
&\quad = 6(3x - 4)(-x^3 + 3x + 2)^3[2(3x - 4)(-x^2 + 1) + (-x^3 + 3x + 2)] \\
&\quad = 6(3x - 4)(-x^3 + 3x + 2)^3[-7x^3 + 8x^2 + 9x - 6]
\end{aligned}$$ ►


We have established that the formula $Dx^n = nx^{n-1}$ is valid for all integers $n$ whether positive, negative, or zero. Using Theorem 6 we can show that this formula is also valid if $n$ is a rational number. To see this, let $f(x) = x^{p/q}$, where $p$ and $q$ are integers. Then $[f(x)]^q = x^p$ and differentiating both sides of this equation we find

$$q[f(x)]^{q-1} f'(x) = px^{p-1}$$

Solving for $f'(x)$ yields

$$f'(x) = \frac{p}{q}\,\frac{x^{p-1}}{[f(x)]^{q-1}} = \frac{p}{q}\,\frac{x^{p-1}}{[x^{p/q}]^{q-1}} = \frac{p}{q}\,x^{(p/q)-1}$$

Example 4 (a) $D\sqrt{x} = Dx^{1/2} = \tfrac{1}{2}x^{-1/2}$

(h) D -1= = Sr 1,! = 
vx- 1 


(c) D 


( x- - i y 
W + 2x + 1 ) 


3 ( x 2 - 1 \ ~'l< 

Ax’ + 2.i+ \) 


[ fx 2 + 2x+ l)2x - (x 2 - l)(3.xr- + 2) 
L (x 3 + 2x + 1 )- 
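The rational-exponent form of the power rule, $Dx^{p/q} = (p/q)x^{(p/q)-1}$, can be spot-checked numerically. The Python sketch below is only an illustration with names of our own choosing; it restricts attention to $x > 0$ so that fractional powers are defined in floating-point arithmetic.

```python
from fractions import Fraction


def rational_power_derivative(p, q, x):
    """Value of (p/q) * x**((p/q) - 1), the derivative of x**(p/q), for x > 0."""
    r = float(Fraction(p, q))
    return r * x ** (r - 1)


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of DF(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


if __name__ == "__main__":
    # square root at x = 4, then the 2/3 power at x = 8
    print(rational_power_derivative(1, 2, 4.0),
          central_difference(lambda x: x ** 0.5, 4.0))
    print(rational_power_derivative(2, 3, 8.0),
          central_difference(lambda x: x ** (2 / 3), 8.0))
```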





PROBLEMS 


Find derivatives of each of the following functions The letters a , b, and c 
denote constants 


!./(*)- (x z ~2* + 3 Y 
3. S(u) = - ■ — “ ~^ T 7 

5. i/(z) = (1 - 2z) 3 ' 2 
1 


7. SCO = 
9. g(x) = 
11. jCO = 


(f 2 - 5)2 


12 + jt 
4I 2 -31+1 
%/6 + 5/2 


2. g(r) = Vl - t 

4. /i(x) = v'O + x2)(l - 4x) 


8. y(x) = ax -f- bx z -f- c 


10. /(»v) = (w - 2) + 


1 


(w - 2) 5 


12. /i(x) = Vl - (1 + x) 1 ** 


13. If $Q$ is a demand function relating quantity $Q(p)$ and price $p$, the elasticity of demand is defined as

$$E(p) = -\frac{dQ(p)/Q(p)}{dp/p} = -\frac{pQ'(p)}{Q(p)}\,^*$$

Prove that elasticity of demand $E(p)$ is unity if $Q(p) = c/p$ for some constant $c$.

14. Prove that if demand is inelastic (that is, E(p ) < 0), then a rise in price 
will increase consumer demand while if demand is elastic ( E(p ) > 0), 
then a fall in price will increase demand 

15. Sometimes the order of composition of two functions is immaterial 
If /(*) = x and g(y) = y 2 show that for all x, (/ o #)(*) = (go f)(x) 

16. Let $f(x) = x^2$ and $g(y) = \sqrt{y}$. Argue that $(f \circ g)(x)$ is defined only for $x \ge 0$, while $(g \circ f)(x)$ is defined for all $x \in R$. Hence, as sets of ordered pairs, $g \circ f$ and $f \circ g$ are different functions, although, when both are defined, they take the same values.

17. Prove that the operation of composing functions is associative. That is, prove that $h \circ (g \circ f) = (h \circ g) \circ f$.


*The minus sign is a convenience to make elasticity nonnegative since demand $Q(p)$ normally decreases as $p$ increases.

12.5 DERIVATIVES OF TRIGONOMETRIC FUNCTIONS

In this section we continue our discussion of differentiation by considering the trigonometric functions, of which the sine and the cosine are the most important. The derivatives of the sine and cosine functions are obtained in Theorem 8 below. In order to prove Theorem 8, though, we need first to know a bit more about limits. The proof of our first theorem is quite simple and is left as an exercise (compare with Problem 6 of Section 5.3).

Theorem 7 (a) Suppose that $\lim_{x \to p} f(x)$ and $\lim_{x \to p} g(x)$ both exist and that for some positive number $r$, the inequality

$$f(x) \le g(x)$$

holds throughout the interval $[p - r, p + r]$. Then

$$\lim_{x \to p} f(x) \le \lim_{x \to p} g(x)$$

(b) Suppose that $\lim_{x \to p} f(x)$ and $\lim_{x \to p} g(x)$ exist and are equal and that for some positive number $r$, the inequality

$$f(x) \le h(x) \le g(x)$$

holds throughout the interval $[p - r, p + r]$. Then $\lim_{x \to p} h(x)$ exists and

$$\lim_{x \to p} f(x) = \lim_{x \to p} h(x) = \lim_{x \to p} g(x)$$ ►

From elementary trigonometry we know that if the line emanating from the origin and passing through the point $(x, y)$ makes an angle $\theta$ with the positive $x$ axis, then

$$\cos\theta = \frac{x}{r}, \qquad \sin\theta = \frac{y}{r}, \qquad \tan\theta = \frac{y}{x}$$

where $r = \sqrt{x^2 + y^2}$ is the distance between $(0, 0)$ and $(x, y)$ (see Figure 18(a)).

FIGURE 18

Thus in Figure 18(b) the lengths of lines $PQ$ and $TR$ are, respectively, $\sin\theta$ and $\tan\theta$. Since the radius of the circle is unity, the number $\theta$ represents the length of the arc $PR$ as well as twice the area of the sector $OPR$. Comparing lengths of $PQ$ and $PR$ shows we must have

$$\sin\theta < \theta \qquad (19)$$

at least for values of $\theta$ which lie between 0 and $\pi/2$. Moreover, using the fact that $\sin(-\theta) = -\sin\theta$, we find

$$-|\theta| \le \sin\theta \le |\theta| \qquad (20)$$

whenever $-\pi/2 < \theta < \pi/2$.

Since $\lim_{\theta \to 0} |\theta| = 0$, so that $\lim_{\theta \to 0} (-|\theta|) = -\lim_{\theta \to 0} |\theta| = 0$ also, it follows from Theorem 7(b) and Equation (20) that

$$\lim_{\theta \to 0} \sin\theta = 0 \qquad (21)$$

Applying Equation (21) to the identity $\sin^2\theta + \cos^2\theta = 1$, we obtain

$$\lim_{\theta \to 0} \cos^2\theta = 1 - \lim_{\theta \to 0} \sin^2\theta = 1$$

Thus

$$\lim_{\theta \to 0} \cos\theta = 1 \qquad (22)$$

since the cosine function has only positive values near $\theta = 0$. Since $\sin 0 = 0$ and $\cos 0 = 1$, Equations (21) and (22) imply that the sine and cosine functions are continuous at $\theta = 0$. Comparing areas of $OPR$ and $OTR$ and using (19), we find for $0 < \theta < \pi/2$ that

$$\sin\theta < \theta < \tan\theta$$

Dividing by $\sin\theta$ gives

$$1 < \frac{\theta}{\sin\theta} < \frac{1}{\cos\theta} \qquad (23)$$

Since $\tan(-\theta) = -\tan\theta$ we also have, for $-\pi/2 < \theta < 0$,

$$\sin\theta > \theta > \tan\theta$$

If we divide by $\sin\theta$ (which is negative for these values of $\theta$) we again obtain

$$1 < \frac{\theta}{\sin\theta} < \frac{1}{\cos\theta} \qquad (24)$$

Since $\lim_{\theta \to 0} \cos\theta = 1$, we find, combining (23) and (24), that

$$\lim_{\theta \to 0} \frac{\theta}{\sin\theta} = 1 \qquad (25)$$


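Writing

$$\frac{\cos h - 1}{h} = \frac{\cos^2 h - 1}{h(\cos h + 1)} = \frac{-\sin^2 h}{h(\cos h + 1)} = \left(\frac{\sin h}{h}\right)\left(\frac{-\sin h}{\cos h + 1}\right)$$

and using Equations (21), (22), and (25), we see that

$$\lim_{h \to 0} \frac{\cos h - 1}{h} = \left[\lim_{h \to 0} \frac{\sin h}{h}\right]\left[\lim_{h \to 0} \frac{-\sin h}{\cos h + 1}\right] = (1)\left(\frac{0}{2}\right) = 0 \qquad (26)$$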
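The two limits (25) and (26) can be observed numerically. The Python sketch below (the function names are our own) tabulates $\theta/\sin\theta$ and $(\cos h - 1)/h$ for values approaching zero from both sides; it illustrates the limits but is of course no substitute for the argument above.

```python
import math


def theta_over_sin(theta):
    """The ratio theta / sin(theta) appearing in (23), (24), and (25)."""
    return theta / math.sin(theta)


def cos_ratio(h):
    """The ratio (cos h - 1) / h appearing in (26)."""
    return (math.cos(h) - 1) / h


if __name__ == "__main__":
    for t in (0.5, 0.1, 0.01, 0.001, -0.001, -0.01, -0.1, -0.5):
        # middle column tends to 1, last column tends to 0
        print(t, theta_over_sin(t), cos_ratio(t))
```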


A standard trigonometric formula, which holds for all values of $\alpha$ and $\theta$, is

$$\sin(\alpha + \theta) = \sin\alpha\cos\theta + \cos\alpha\sin\theta \qquad (27)$$

If we let $x = \alpha + \theta$ then $x \to \alpha$ as $\theta \to 0$. Thus, taking limits in (27) we find

$$\begin{aligned}
\lim_{x \to \alpha} \sin x &= \lim_{\theta \to 0} \sin(\alpha + \theta) \\
&= (\sin\alpha) \lim_{\theta \to 0} \cos\theta + (\cos\alpha) \lim_{\theta \to 0} \sin\theta \\
&= (\sin\alpha) \cdot 1 + (\cos\alpha) \cdot 0 \\
&= \sin\alpha
\end{aligned}$$

In the language of Section 11.4, this shows that the sine function is continuous at every point.

PROBLEMS

1. Use the trigonometric formulas

$$\cos(\theta + \alpha) = \cos\theta\cos\alpha - \sin\theta\sin\alpha, \qquad \tan\theta = \frac{\sin\theta}{\cos\theta}$$

to show that the cosine function is continuous at all points, while $\tan\theta$ is continuous except where $\cos\theta = 0$.

Formulas (25) and (26) enable us to find derivatives of the basic trigonometric functions.

Theorem 8 The derivative of the sine function is the cosine function. The derivative of the cosine function is the negative of the sine function. Symbolically,

$$D\sin x = \cos x$$
$$D\cos x = -\sin x$$

PROOF By definition,

$$D\sin x = \lim_{\theta \to 0} \frac{\sin(x + \theta) - \sin x}{\theta}$$

Since $\sin(x + \theta) = \sin x\cos\theta + \cos x\sin\theta$, this may be rewritten as

$$D\sin x = \lim_{\theta \to 0} \left[\sin x\,\frac{\cos\theta - 1}{\theta} + \cos x\,\frac{\sin\theta}{\theta}\right]$$

Using (25) and (26) gives the desired result, namely $D\sin x = \cos x$. For the derivative of the cosine function, we obtain

$$\begin{aligned}
D\cos x &= \lim_{\theta \to 0} \frac{\cos(x + \theta) - \cos x}{\theta} \\
&= \lim_{\theta \to 0} \frac{\cos x\cos\theta - \sin x\sin\theta - \cos x}{\theta} \\
&= (\cos x) \lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} - (\sin x) \lim_{\theta \to 0} \frac{\sin\theta}{\theta} \\
&= (\cos x)(0) - (\sin x)(1) = -\sin x
\end{aligned}$$ ►
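Theorem 8 can be spot-checked with a symmetric difference quotient, as in the Python sketch below. The helper name, the step size, and the sample points are our own choices for illustration.

```python
import math


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of DF(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.5):
        # D sin x should be close to cos x
        print(x, central_difference(math.sin, x), math.cos(x))
        # D cos x should be close to -sin x
        print(x, central_difference(math.cos, x), -math.sin(x))
```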


Example 1 The interplay between the sine and cosine functions is readily apparent from the graphs in Figure 19. The sine function is increasing when its derivative (the cosine) is positive, and decreasing when the cosine is negative. The cosine function increases when the sine function is negative (that is, when its derivative, which is the negative of the sine function, is positive), and decreases when the sine function is positive. ►

FIGURE 19

Since all other trigonometric functions can be expressed as quotients involving the sine and cosine, their derivatives can be obtained by using Theorem 8 in conjunction with Theorem 5. For instance,

$$D\tan x = D\,\frac{\sin x}{\cos x} = \frac{(\cos x)(\cos x) - (\sin x)(-\sin x)}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x$$

In a similar manner we find $D\cot x = -\csc^2 x$, $D\sec x = \sec x\tan x$, and $D\csc x = -\csc x\cot x$. The proofs will be left as an exercise. Table 2, Section 12.7, includes these trigonometric differentiation formulas along with other differentiation formulas.


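Example 2 (a) To differentiate $h(x) = \cos(2x + 3)$, let $f(x) = 2x + 3$ and $g(y) = \cos y$ so that $h(x) = g(f(x))$. Applying the Chain Rule gives

$$D\cos(2x + 3) = h'(x) = g'(f(x))f'(x) = -\sin(2x + 3) \cdot 2 = -2\sin(2x + 3)$$

(b) To differentiate $h(x) = \sin(x^2 - 1)^{3/2}$, let $y = f(x) = (x^2 - 1)^{3/2}$ and $z = g(y) = \sin y = \sin(x^2 - 1)^{3/2}$. Then

$$Dh(x) = \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = (\cos y) \cdot \tfrac{3}{2}(x^2 - 1)^{1/2} \cdot 2x = 3x(x^2 - 1)^{1/2}\cos(x^2 - 1)^{3/2}$$

(c)

$$\begin{aligned}
D[\cos(2x^{1/2} - 1)]^{5/2} &= \tfrac{5}{2}[\cos(2x^{1/2} - 1)]^{3/2}[-\sin(2x^{1/2} - 1)][2 \cdot \tfrac{1}{2}x^{-1/2}] \\
&= -\tfrac{5}{2}x^{-1/2}[\cos(2x^{1/2} - 1)]^{3/2}\sin(2x^{1/2} - 1)
\end{aligned}$$

(d)

$$\begin{aligned}
D\sin\left(\frac{x^2 + x}{x^3 - 3}\right) &= \cos\left(\frac{x^2 + x}{x^3 - 3}\right) \cdot \frac{(x^3 - 3)(2x + 1) - (x^2 + x)3x^2}{(x^3 - 3)^2} \\
&= -\frac{x^4 + 2x^3 + 6x + 3}{(x^3 - 3)^2}\cos\left(\frac{x^2 + x}{x^3 - 3}\right)
\end{aligned}$$

(e)

$$D\left[\frac{\sin(x^2 - 1)}{x + \tan 3x}\right] = \frac{[x + \tan 3x][\cos(x^2 - 1)]2x - [\sin(x^2 - 1)][1 + 3\sec^2 3x]}{[x + \tan 3x]^2}$$ ►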


324 / DIFFERENTIAL CALCULUS 



PROBLEMS

2. Find derivatives of

(a) $\cos (t - 1)^2$ (b) $\tan(-x^3 + 2)$

(c) $[\sin(2x)][\cos (x - 1)^{3/2}]$

3. Use Theorem 8 and the relations $\cot x = \cos x/\sin x$, $\sec x = 1/\cos x$,
and $\csc x = 1/\sin x$ to derive formulas for derivatives of the cotangent,
secant, and cosecant functions.

Example 3 In a study of the beetle Pyrophorus, Kropp* found that the
average angle of orientation $\alpha$ at which the beetle climbed an inclined plane
was related to $\theta$, the angle of incline of the plane, according to the function

$$\alpha(\theta) = K\sin\theta + C, \qquad K > 0$$

where $C$ and $K$ are constants. The rate of change of $\alpha$ with respect to $\theta$ is thus
$\alpha'(\theta) = K\cos\theta$. ►


PROBLEMS

In Problems 4-17, find the derivative of the given function.

4. $f(x) = \tan 2x$
5. $h(y) = y\tan y^2$
6. $g(t) = (\sin 2t)(\cos^2 3t)$
7. $g(x) = \tan^2(2x^2)$
8. $f(z) = z/\sin 4z$
9. $f(x) = \cos(2x - 4)$
10. $f(t) = t^2\cos 3t$
11. $f(w) = \sqrt{\sin w}$
12. $f(w) = w + \cot w$
13. $f(x) = x\tan^2 x$
14. $f(u) = \sec 2u\,\tan u$
15. $f(x) = x^2/\cos^2 x$
16. $f(t) = \csc(4t - 2)$
17. $f(x) = \tan (x + 1)^{1/2}$

18. Sketch the graphs of

(a) $\sin(x/2)$ (b) $\cos(1/x)$ (c) $\cos(2x + \pi)$

19. Prove that

(a) $D\sin ax = a\sin(ax + \pi/2)$

(b) $D\cos ax = a\cos(ax + \pi/2)$

(Hint: $\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta$.)


12.6 INVERSE FUNCTIONS— THE INVERSE 
TRIGONOMETRIC FUNCTIONS 

A function $f$ is said to be one-one, or 1-1, on a set $A$ if no two points in $A$
have the same image; that is, if $f(x) \neq f(y)$ whenever $x \in A$, $y \in A$, and
$x \neq y$. For instance, if $\alpha \neq 0$, the function $g(x) = \alpha x + \beta$ is 1-1 on the

•Kropp, B. "Gcouopic Oucnution m Arthrojxxls: IV, The tkxllc pyropharus," Journal 
of Oaxr.il ftjrinfect 3. 4M-4S8 (1929) 


12 6 INVERSE FUNCTIONS / 325 



entire real line. In this case, if $x \neq y$, then $\alpha x + \beta \neq \alpha y + \beta$ (see Figure 20
(a) and (b)). On the other hand, if $\alpha = 0$, then $g(x)$ is not 1-1 (see Figure 20
(c)). In this case, $\alpha x + \beta = \alpha y + \beta = \beta$, regardless of the values of $x$ and $y$.


FIGURE 20 



(a) (b) (c) 


The sine function is 1-1 between 0 and $\pi/2$ and also between $\pi/2$ and
$3\pi/2$, but not between 0 and $\pi$ since, for instance, $\sin 0 = \sin\pi = 0$. (See
Figure 19, Section 12.5.) The functions $x^2$ and $|x|$ are 1-1 in an interval
from $a$ to $b$ if $a$ and $b$ are both positive or both negative. However, neither
of these functions is 1-1 in the interval $[-2, 2]$ since $(-1)^2 = 1^2$ and
$|-1| = |1|$. (See Figure 21.)


FIGURE 21 



PROBLEMS 


1. List some other intervals in which the sine function is one-one.
2. A function $f$ is said to be strictly increasing if $x < y$ implies $f(x) < f(y)$
and strictly decreasing if $x < y$ implies $f(x) > f(y)$ (see Section 12.3).
Prove that in either case, $f$ must be one-one.

3. Must a one-one function be either strictly increasing or strictly decreasing?


Example 1 One of the fundamental problems of measurement is to show
that a given empirical domain exhibits the same structure as some arithmetical
structure of numbers. If a common structure can be identified, the





arithmetic system is said to be isomorphic to the empirical domain. Once
an isomorphism has been established, questions about the empirical domain
can be transferred to the arithmetic system, computations performed there,
and the results transferred back and interpreted.
Suppes and Zinnes* have formulated a precise definition of isomorphism
using the concept of a relational system. According to their usage, a relational
system is a finite sequence $(S, R_1, R_2, \ldots, R_n)$ in which $S$ is a nonempty
set of elements called the domain of the system and $R_1, R_2, \ldots, R_n$
are relations on $S$. Two relational systems $(S, R)$ and $(T, Q)$ are called
isomorphic if there is a one-one function $f$ mapping $S$ to $T$ such that for all
$x$ and $y$ in $S$,

$$x\,R\,y \iff f(x)\,Q\,f(y)$$

As a simple example, let $S = \{2, 4, 6, 8\}$, $T = \{1, 6, 7, 10\}$, $R$ be $>$,
$Q$ be $<$, and define $f$ by

$$f(2) = 10,\quad f(4) = 7,\quad f(6) = 6,\quad\text{and}\quad f(8) = 1$$

Then it is apparent that $x > y$ if and only if $f(x) < f(y)$, that is, $x\,R\,y$ if
and only if $f(x)\,Q\,f(y)$, so that the systems $(S, R)$ and $(T, Q)$ are isomorphic.
If the function $f$ above is not necessarily one-one, the systems are said to
be homomorphic or, more precisely, $(T, Q)$ is a homomorphic image of
$(S, R)$. Homomorphic relational systems are used by Suppes and Zinnes as
the basis for a formal definition and classification of measurement scales, as
follows.

Let $U = (S, R)$ be an empirical relational system and let $f$ map $U$ homomorphically
into a system $V = (T, Q)$ in which $T$ is some set of real numbers.
Then the ordered triplet $(U, V, f)$ is a scale.

Various types of scales are obtained by forming compositions $g = \phi \circ f$
in such a way that $(U, V, g)$ is also a scale. Specifically, $(U, V, g)$ is

(i) a ratio scale, if $\phi(x) = \alpha x$, where $\alpha > 0$
(ii) an absolute scale, if $\phi(x) = x$
(iii) an interval scale, if $\phi(x) = \alpha x + \beta$, where $\alpha > 0$
(iv) an ordinal scale, if $\phi$ is monotone
(v) a nominal scale, if $\phi$ is one-one ►


Since each point $y$ in the range of a one-one function $f$ is the image of
exactly one point $x$ in the domain of $f$, we may define a new function $g$,
called the inverse of $f$, which associates with each $y$ in the range of $f$ the
point $x$ in the domain of $f$ which maps into $y$ (Figure 22). Symbolically,

$$g(y) = x \iff f(x) = y \qquad (28)$$

*Suppes, P., and Zinnes, J. L., "Basic Measurement Theory," in Handbook of Mathematical
Psychology, Luce, R. D., Bush, R. R., and Galanter, E., Eds. (John Wiley & Sons,
Inc., New York, 1963), Vol. 1.





FIGURE 22 



g 


Example 2 If $f(x) = 2x + 1$, then the function $g(y) = \tfrac{1}{2}(y - 1)$ is
inverse to $f$. For, if

$$y = f(x) = 2x + 1$$

then

$$x = \tfrac{1}{2}(y - 1) = g(y)$$

and conversely. (See Figure 23.) ►
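The inverse relationship in Example 2 can be spot-checked numerically. The following is only a minimal sketch: the function names `f` and `g` mirror the text, while the list of sample points is an arbitrary assumption chosen so that every value is exactly representable in floating point.

```python
def f(x):
    """The function of Example 2: f(x) = 2x + 1."""
    return 2 * x + 1


def g(y):
    """Its claimed inverse: g(y) = (y - 1) / 2."""
    return (y - 1) / 2


# Arbitrary sample points (an assumption for illustration only).
sample_points = [-3.0, -0.5, 0.0, 1.0, 7.25]

# Both compositions should return the starting point.
round_trip_ok = all(g(f(x)) == x and f(g(x)) == x for x in sample_points)
print(round_trip_ok)
```

A check at finitely many points illustrates the defining relation but does not replace the algebraic argument given in the example.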

FIGURE 23 (a) The function f; (b) the inverse function g of f



The inverse of a 1-1 function $f$ is usually denoted by $f^{-1}$. In general, if
$f(x) = y$ and $f$ is 1-1, then the relations

$$f^{-1}(f(x)) = f^{-1}(y) = x$$

$$f(f^{-1}(y)) = f(x) = y \qquad (29)$$

follow immediately from the defining equation (28).

Example 3 The nonsingular linear transformations introduced in Section
10.3 provide good examples of functions which have inverses, but whose
range and domain are not necessarily sets of real numbers. For instance,
the matrix






has the inverse

$A^{-1}$

Thus the function $f$ defined by

$$f(X) = AX$$

where

has the inverse

$$f^{-1}(Y) = A^{-1}Y$$

That is,

$$f^{-1}[f(X)] = f^{-1}(AX) = A^{-1}(AX) = (A^{-1}A)X = X$$

$$f[f^{-1}(Y)] = f(A^{-1}Y) = A(A^{-1}Y) = (AA^{-1})Y = Y$$ ►

PROBLEMS

4. If $f(x) = ax + b$ and $a \neq 0$, show that the function $g(y) = (1/a)(y - b)$
is inverse to $f$. Graph the functions $f$ and $g$ for various values of $a$ and $b$.

5. The function $f(x) = ax + b$ has no inverse if $a = 0$. Why? (Warning:
It is not because division by $a$ is not possible in $(1/a)(y - b)$.)

6. Why does the function $f(x) = x^2$ have no inverse?

7. It should be obvious that if $g$ is the inverse of $f$, $f$ is also the inverse of $g$.
But, just to make sure this is obvious, write out a proof.

The trigonometric functions sine, cosine, and tangent are defined on the
entire real line, as we have previously seen, but are not one-one there (see
Figure 19, Section 12.5). Hence, these functions have no inverses. However,
if the domain is suitably restricted, a local inverse may be defined. Let the
function $g$ be defined on the interval $[-\pi/2, \pi/2]$ by $g(x) = \sin x$. Then $g$ is
1-1 and an inverse function defined by $g^{-1}(y) = \sin^{-1} y = x$ if and only if
$y = \sin x$ can be determined. Thus, $\sin^{-1}(1) = \pi/2$, $\sin^{-1}(0) = 0$,
$\sin^{-1}(-1) = -\pi/2$, $\sin^{-1}(\sqrt{2}/2) = \pi/4$, and so forth (see Figure 24).
Actually, if the domain of the sine is restricted to any interval of length $\pi$
beginning at an odd multiple of $\pi/2$, a local inverse function can be found.
However, the inverse of the restriction indicated above is usually called the inverse sine
function or, sometimes, the arcsine function.
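The particular values quoted above can be compared with the standard library's arcsine, which uses the same restriction of the sine to the interval from -π/2 to π/2. This is a small illustrative sketch; the variable names and the tolerance are assumptions.

```python
import math

# (y, expected x) pairs quoted in the text for the inverse sine.
expected = [
    (1.0, math.pi / 2),
    (0.0, 0.0),
    (-1.0, -math.pi / 2),
    (math.sqrt(2) / 2, math.pi / 4),
]

# math.asin returns the unique x in [-pi/2, pi/2] with sin(x) = y.
values_agree = all(
    math.isclose(math.asin(y), x, abs_tol=1e-12) for y, x in expected
)

# The unrestricted sine is not one-one: 0 and pi have the same image
# (up to floating-point rounding), so no inverse exists on [0, pi].
same_image = math.isclose(math.sin(0.0), math.sin(math.pi), abs_tol=1e-12)

print(values_agree, same_image)
```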

In similar fashion, inverses for the cosine and tangent functions are defined
by restricting these functions to the domains $[0, \pi]$ and $(-\pi/2, \pi/2)$,





FIGURE 24 Graphs of the inverse trigonometric functions:
(a) inverse sine function, (b) inverse cosine function,
(c) inverse tangent function



respectively (see Figure 24). The domain of the restricted tangent function
does not contain the endpoints since $\tan(\pi/2)$ and $\tan(-\pi/2)$ do not exist.
Equation (29) shows that the composition of a function with its inverse
always produces the identity function. That is, if $f$ is one-one, then

$$(f \circ f^{-1})(y) = f(f^{-1}(y)) = y$$

$$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = x$$

for all values $x$ in the domain of $f$ and $y$ in the range of $f$. This, together with
the Mean Value Theorem, enables us to obtain a formula for the derivative
of the inverse of a function in terms of the derivative of the function itself.
In so doing we accept without proof the following theorem.

Theorem 9 Inverse Function Theorem Suppose $f$ is a 1-1 function having
an inverse $f^{-1}$. Then if $f$ is continuous, so is $f^{-1}$. ►

Theorem 10 If $f$ is a continuous one-one function having a continuous
derivative $f'$, then the inverse function $g = f^{-1}$ has a derivative which is

$$g'(x) = \frac{1}{f'(g(x))}$$

at every point $x$ where $f'(g(x)) \neq 0$. (Note that $x$ is a point in the domain of $g$, not of $f$.)

PROOF Since $f(g(x)) = x$ for all $x$, we can write $h = (x + h) - x =
f(g(x + h)) - f(g(x))$. Applying the Mean Value Theorem to $f$ in this
expression gives

$$h = [g(x + h) - g(x)]\,f'(t) \qquad\text{or}\qquad \frac{g(x + h) - g(x)}{h} = \frac{1}{f'(t)}$$

where $t$ is some point between $g(x)$ and $g(x + h)$. Now $g$ is continuous
(Theorem 9) so that $g(x + h) \to g(x)$ as $h \to 0$. With $t$ trapped between





$g(x)$ and $g(x + h)$ we must also have $t \to g(x)$ as $h \to 0$. Applying the continuity
of $f'$ gives $f'(t) \to f'(g(x))$ as $h \to 0$, or

$$g'(x) = \lim_{h \to 0}\frac{g(x + h) - g(x)}{h} = \lim_{h \to 0}\frac{1}{f'(t)} = \frac{1}{f'(g(x))}$$

as asserted. ►


To illustrate the use of Theorem 10, suppose $g(x) = \sin^{-1} x$. Then

$$Dg(x) = D\sin^{-1} x = \frac{1}{D\sin(\sin^{-1} x)} = \frac{1}{\cos(\sin^{-1} x)}$$

Now $\cos^2\theta + \sin^2\theta = 1$ for all $\theta$. Hence if $\theta = \sin^{-1} x$, so that $\sin\theta =
\sin(\sin^{-1} x) = x$, we have

$$\cos^2\theta = 1 - \sin^2\theta = 1 - x^2$$

Since $\cos\theta \geq 0$ for all $\theta$ in the range of the inverse sine function, it follows
that

$$D\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}} \qquad\text{for } -1 < x < 1$$

Similar considerations yield

$$D\cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}} \qquad\text{and}\qquad D\tan^{-1} x = \frac{1}{1 + x^2}$$

These proofs are left as exercises.
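As a sanity check on the three formulas just stated, one can compare them with a finite-difference estimate of the derivative. The sketch below is illustrative only: the helper `central_difference`, the test point `x = 0.3`, and the step size are assumptions, and a numerical agreement is not a proof.

```python
import math


def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, an approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


x = 0.3  # any point with -1 < x < 1 would do

numeric = {
    "asin": central_difference(math.asin, x),
    "acos": central_difference(math.acos, x),
    "atan": central_difference(math.atan, x),
}

# Closed forms from the text.
formula = {
    "asin": 1 / math.sqrt(1 - x**2),
    "acos": -1 / math.sqrt(1 - x**2),
    "atan": 1 / (1 + x**2),
}

for name in numeric:
    print(name, numeric[name], formula[name])
```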


Example 4 To find the derivative of $h(x) = x^2\cos^{-1} 2x$, we first use the
product rule to obtain

$$Dh(x) = x^2\,D\cos^{-1} 2x + (\cos^{-1} 2x)\,Dx^2$$

Applying the Chain Rule gives

$$D\cos^{-1} 2x = -\frac{1}{\sqrt{1 - (2x)^2}}\,D(2x) = -\frac{2}{\sqrt{1 - 4x^2}}$$

so that

$$Dx^2\cos^{-1} 2x = -\frac{2x^2}{\sqrt{1 - 4x^2}} + 2x\cos^{-1} 2x$$


PROBLEMS 

In Problems 8-13 find the derivative of the given function.

8 m(v) =» sin " 1 2x 9 /O) = tan’ 

t 0 * *,(*) = vcot " 1 (^x) 11 . tt(i) = esc" 1 1 ‘ 

12 . /(x) =» a~ cos' 


■0 


-© 


+ vva* — where a is a constant 





(b) $D\tan^{-1} x = \dfrac{1}{1 + x^2}$

15. The graph of $f(x) = 2x + 1$ may be obtained from the graph of its inverse
$g(y) = \tfrac{1}{2}(y - 1)$ by rotating the plane around the line $f(x) = x$.
(See Figure 23.) Prove that the same is true of any function and its
inverse.

16. Prove that if a function is strictly increasing (decreasing) then its inverse
is also strictly increasing (decreasing).

17. If, in Example 1, we let

$S = \{2, 4, 6, 8\}$, $T = \{1, 6, 7, 10\}$, $R$ be $>$ and $Q$ be $<$

show that $(S, R)$ and $(T, Q)$ are not isomorphic.

18. (a) If $S = \{4, 8\}$, $T = \{12, 16, 20\}$, $R$ is $>$, and $Q$ is $<$, prove that
$(S, R)$ and $(T, Q)$ are not isomorphic.
(b) Are they homomorphic? That is, can you find a function $f$, not
necessarily one-one, so that $x\,R\,y$ if and only if $f(x)\,Q\,f(y)$? (Hint:
Try making $(S, R)$ a homomorphic image of $(T, Q)$.)

19. If $A$ is a subset of the domain of the function $f$, the image of $A$ under the
function $f$ is defined as

$$f(A) = \{f(x) : x \in A\}$$

For any two subsets $A$ and $B$ of the domain of $f$, prove that

(a) $f(A \cup B) = f(A) \cup f(B)$

(b) $f(A \cap B) \subseteq f(A) \cap f(B)$

20. For any two subsets $A$ and $B$ of the range of $f$, prove that
(a) $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$. Note that for any set $C$ in the
range of $f$ we define $f^{-1}(C)$ to be the set of all points $x$ in the domain
of $f$ for which $f(x) \in C$. This definition is valid, and the notation
$f^{-1}(C)$ will be used, whether or not $f$ has an inverse.
(b) $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$

21. Prove that if $h = g \circ f$ denotes the composition of the functions $f: X \to Y$
and $g: Y \to Z$, then

(a) $h(A) = g[f(A)]$ for every $A \subset X$

(b) $h^{-1}(C) = f^{-1}[g^{-1}(C)]$ for every $C \subset Z$

22. Let $A$ be a subset of the domain and $B$ a subset of the range of the function
$f$. Prove

(a) $f^{-1}[f(A)] \supseteq A$

(b) $f[f^{-1}(B)] = B$

(c) $f[A \cap f^{-1}(B)] \subseteq f(A) \cap B$
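Before attempting proofs of the image and inverse-image relations in Problems 19-22, it can help to see them on a small finite example. The sketch below is an illustration, not a proof; the function $f(x) = x^2$, the finite domain, the sets `A` and `B`, and the helper names `image` and `preimage` are all hypothetical choices.

```python
# A small finite domain and a function on it that is NOT one-one.
domain = {-2, -1, 0, 1, 2}


def f(x):
    return x * x


def image(subset):
    """f(A) = { f(x) : x in A }."""
    return {f(x) for x in subset}


def preimage(targets):
    """f^{-1}(C) = { x in the domain : f(x) in C }; no inverse function is needed."""
    return {x for x in domain if f(x) in targets}


A = {-2, -1}
B = {1, 2}

union_rule = image(A | B) == image(A) | image(B)        # equality, as in 19(a)
intersection_rule = image(A & B) <= image(A) & image(B)  # inclusion, as in 19(b)
strict = image(A & B) != image(A) & image(B)             # the inclusion can be proper
pullback = preimage(image(A)) >= A                       # as in 22(a)

print(union_rule, intersection_rule, strict, pullback)
```

Here `A` and `B` are disjoint, so the image of their intersection is empty even though their images coincide, which shows why 19(b) is stated as an inclusion rather than an equality.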


13. /i(x) = x tan 

14. Prove that

(a) $D\cos^{-1} x = -\dfrac{1}{\sqrt{1 - x^2}}$







12.7 LOGARITHMIC AND EXPONENTIAL FUNCTIONS


An exponential function $h$ is any function of the form

$$h(x) = b^x$$

where $b$ is some positive constant. For example, $f(x) = 2^x$, $g(x) = 10^x$, and
$m(x) = 5^x$ are all exponential functions. The function referred to as the
exponential function is

$$E(x) = e^x$$

where $e$ is approximately 2.718. The constant $e$ was introduced in Problem 8
of Section 5.4 as the limit $\lim_{n \to \infty}[1 + (1/n)]^n = e$. There we saw that an
amount $P$ of money invested at $i$ percent and compounded continuously
(that is, an infinite number of times per year) would grow in $n$ years to the
quantity $P_n = Pe^{in}$.
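The limit defining $e$ and the continuous-compounding formula can both be explored numerically. In this sketch the principal of 1000, the rate of 5 percent (entered as the decimal 0.05), and the 10-year horizon are made-up values for illustration, and the function name `continuous_growth` is an assumption.

```python
import math


def compound_limit_term(n):
    """The n-th term (1 + 1/n)^n of the sequence whose limit is e."""
    return (1 + 1 / n) ** n


# The terms approach e = 2.71828... as n grows.
terms = {n: compound_limit_term(n) for n in (1, 10, 1000, 10**6)}
for n, value in terms.items():
    print(n, value)


def continuous_growth(principal, rate, years):
    """P_n = P * e^(i * n), with the rate i given as a decimal fraction."""
    return principal * math.exp(rate * years)


# Hypothetical figures: 1000 invested at 5 percent for 10 years.
final_amount = continuous_growth(1000.0, 0.05, 10)
print(round(final_amount, 2))
```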

The natural logarithm function $\ln$ is the inverse of the exponential function
$E$. That is,

$$\ln y = x \iff y = e^x$$

This corresponds to the usual definition of the natural logarithm of a number
$y$ as the power to which $e$ must be raised in order to obtain $y$. From our
knowledge of inverse functions this means that

$$\ln e^x = x \qquad\text{and}\qquad e^{\ln x} = x \qquad (30)$$

the first for all real numbers $x$ and the second for all positive $x$. In particular, $\ln e = 1$, and $e$ is called the base of the
natural logarithms.

For any positive number $b$ with $b \neq 1$, the inverse of the exponential function

$$h(x) = b^x$$

is called the logarithm function to the base $b$ and denoted $\log_b$. Thus

$$\log_b y = x \iff y = b^x \qquad (31)$$

The logarithm to the base $b$ of $y$ is the power to which $b$ must be raised in
order to obtain $y$. The most commonly used logarithms are logarithms to
the base 10. Such logarithms are called the common logarithms.
The following properties of exponential and logarithm functions, familiar
from algebra, are summarized here for reference purposes.
(i) By definition,

(ii) For any real numbers $x$ and $y$,

$$b^{x+y} = b^x b^y$$

(iii) For any positive numbers $a$ and $c$,

$$\log_b ac = \log_b a + \log_b c$$

127 LOGARITHMS AND EXPONENTS / 333 



(iv) If $a$ is positive, then

$$\log_b \frac{1}{a} = -\log_b a$$

(v) Combining (iii) and (iv) yields

$$\log_b \frac{a}{c} = \log_b a - \log_b c$$

(vi) For any exponent $x$,

$$\log_b a^x = x\log_b a$$

We have seen that the derivative formula $Dx^{n+1} = (n + 1)x^n$ is valid for
all exponents $n$. This formula thus determines a function whose derivative is
$x^n$ for every $n$ except $n = -1$. For in that case we obtain $Dx^0 = 0$, rather
than a derivative in a form involving $x^{-1}$. This naturally raises the question,
"Is there a function whose derivative is $1/x$?" The answer is "yes" and, as
we now show, that function is the natural logarithm function.

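Theorem 11 The derivative of the natural logarithm function is

$$D\ln x = \frac{1}{x}$$

PROOF To obtain the derivative of $\ln x$, we write

$$D\ln x = \lim_{h \to 0}\frac{\ln(x + h) - \ln x}{h} = \lim_{h \to 0}\frac{1}{h}\ln\left(\frac{x + h}{x}\right) = \lim_{h \to 0}\frac{1}{h}\ln\left(1 + \frac{h}{x}\right) = \lim_{h \to 0}\ln\left(1 + \frac{h}{x}\right)^{1/h}$$

We now make the substitution $z = x/h$, thus changing the variable in the
limit statement (note that $z \to \infty$ as $h \to 0$) to obtain

$$D\ln x = \lim_{z \to \infty}\ln\left(1 + \frac{1}{z}\right)^{z/x} = \frac{1}{x}\lim_{z \to \infty}\ln\left(1 + \frac{1}{z}\right)^{z} \qquad (32)$$

Since $\lim_{z \to \infty}[1 + (1/z)]^z = e$ and since $\ln e = 1$, it follows* that

$$D\ln x = \frac{1}{x}$$

as claimed. ►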

If $y = \ln x$ so that $x = e^y$, then $\log_b x = \log_b e^y = y\log_b e = (\ln x)\log_b e$. It follows
that

$$D\log_b x = (\log_b e)\,D\ln x = (\log_b e)\,\frac{1}{x}$$

Putting this result together with the Chain Rule gives the following general
theorem.

*According to Section 11.4, a function $f$ is continuous at a point $a$ if $\lim_{x \to a} f(x) = f(a) =
f(\lim_{x \to a} x)$. Thus the interchange of limit and $\ln$ in evaluating (32) requires continuity of
the logarithm function, a fact which we accept without proof.





Theorem 12 If the function $\phi$ has a derivative $\phi'$, then

(a) $D\ln\phi(x) = \dfrac{\phi'(x)}{\phi(x)}$

(b) $D\log_b\phi(x) = (\log_b e)\,\dfrac{\phi'(x)}{\phi(x)}$

PROOF We shall prove (a) only, leaving (b) as an exercise. Define the
composite function $h$ by $h(x) = L(\phi(x))$ where $L(y) = \ln y$. Then from
the Chain Rule, we obtain

$$D\ln\phi(x) = h'(x) = L'(\phi(x))\,\phi'(x) = \frac{1}{\phi(x)}\,\phi'(x) = \frac{\phi'(x)}{\phi(x)}$$ ►


Example 1 (a) To differentiate $h(x) = \ln(2x^2 - 1)$, we use Theorem
12(a) with $\phi(x) = 2x^2 - 1$ to obtain

$$D\ln(2x^2 - 1) = \frac{D(2x^2 - 1)}{2x^2 - 1} = \frac{4x}{2x^2 - 1}$$

(b) $$D\ln(x - 2)^5 = \frac{D(x - 2)^5}{(x - 2)^5} = \frac{5(x - 2)^4}{(x - 2)^5} = \frac{5}{x - 2}$$

Alternatively, since $\ln(x - 2)^5 = 5\ln(x - 2)$,

$$D\ln(x - 2)^5 = 5\,D\ln(x - 2) = \frac{5}{x - 2}$$

(c) To differentiate $f(x) = x^x$ we first take logarithms to obtain $\ln f(x) =
x\ln x$. Then

$$\frac{f'(x)}{f(x)} = D\ln f(x) = Dx\ln x = x\left(\frac{1}{x}\right) + \ln x = 1 + \ln x$$

Solving for $f'(x)$ we have $f'(x) = x^x(1 + \ln x)$.

(d) $$D\log_{10}(x^2 - x + 1) = (\log_{10} e)\,\frac{2x - 1}{x^2 - x + 1}$$ ►
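The logarithmic-differentiation result in part (c), $f'(x) = x^x(1 + \ln x)$, is easy to get wrong by hand, so a numerical comparison is a useful check. This is a minimal sketch; the evaluation point `x0 = 2.0`, the step size, and the helper names are assumptions.

```python
import math


def f(x):
    """f(x) = x^x, defined here for x > 0."""
    return x**x


def f_prime_formula(x):
    """Derivative obtained by logarithmic differentiation: x^x (1 + ln x)."""
    return x**x * (1 + math.log(x))


def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, an approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 2.0
numeric_slope = central_difference(f, x0)
exact_slope = f_prime_formula(x0)  # 4 * (1 + ln 2)
print(numeric_slope, exact_slope)
```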


An interesting property of the exponential function $E(x) = e^x$ is that it
is its own derivative. To see this, we differentiate both sides of the equation
$\ln e^x = x$ to obtain

$$1 = Dx = D\ln e^x = \frac{De^x}{e^x}$$

from which it immediately follows that $De^x = e^x$. This result, too, may be
generalized by using the Chain Rule.

Theorem 13 If the function $\phi$ has a derivative, then

(a) $De^{\phi(x)} = e^{\phi(x)}\,D\phi(x)$

(b) $Db^{\phi(x)} = (\ln b)\,b^{\phi(x)}\,D\phi(x)$

In particular, $Db^x = (\ln b)b^x$.





PROOF (a) Define the composite function $h$ by $h(x) = E(\phi(x))$ where
$E(y) = e^y$. Then using the Chain Rule,

$$De^{\phi(x)} = h'(x) = E'(\phi(x))\,\phi'(x) = e^{\phi(x)}\phi'(x) = e^{\phi(x)}\,D\phi(x)$$

(b) Since $b = e^{\ln b}$, we have $b^{\phi(x)} = e^{(\ln b)\phi(x)}$. Thus

$$Db^{\phi(x)} = De^{(\ln b)\phi(x)} = (\ln b)\,\phi'(x)\,e^{(\ln b)\phi(x)} = (\ln b)\,b^{\phi(x)}\,D\phi(x)$$ ►

Example 2 (a) To find the derivative of $h(x) = e^{2x^2 - 1}$, we use Theorem
13(a) with $\phi(x) = 2x^2 - 1$ to obtain

$$De^{2x^2 - 1} = [D(2x^2 - 1)]\,e^{2x^2 - 1} = 4xe^{2x^2 - 1}$$

(b) $$De^{(x - 2)^5} = [D(x - 2)^5]\,e^{(x - 2)^5} = 5(x - 2)^4 e^{(x - 2)^5}$$

(c) $$D[e^{x^2}\ln(2x - 1)] = e^{x^2}\,D\ln(2x - 1) + [\ln(2x - 1)]\,De^{x^2} = \frac{2e^{x^2}}{2x - 1} + 2xe^{x^2}\ln(2x - 1)$$

(d) $$D(10^{x^2 - x + 1}) = (\ln 10)(2x - 1)\,10^{x^2 - x + 1}$$ ►


Example 3 In his classic theory of learning, Hull* assumed that habit
strength is given in terms of the number $N$ of repetitions by

$$H(N) = 100(1 - e^{-iN})$$

where $i$ is a positive constant. Assuming for purposes of illustration that $N$
can be any positive number, the instantaneous rate of change of habit
strength is

$$H'(N) = -100e^{-iN}\,D(-iN) = 100ie^{-iN}$$

Here we see that the constant $i$ indicates the rate at which habit strength is
acquired. Moreover, since $H'(N)$ is always positive, habit strength is a monotone
increasing function of the number of repetitions. ►


Since $De^x = e^x$ is always positive, it follows that the exponential function
is increasing for all $x$. Moreover, since the derivative $e^x$ increases as $x$ increases,
the function must increase at an ever increasing rate. These properties
are evident in Figure 25(a). In a similar way, we see that the slope
$D\ln x = 1/x$ of the natural logarithm function is positive but decreasing,
which leads to the graph of Figure 25(b).

Not only does the exponential function turn out to be its own derivative,
but it is essentially the only function having this property. To see this, suppose
that $g$ is another function for which $g'(x) = g(x)$ and define a function

*Hull, C. L., Principles of Behavior (Appleton-Century-Crofts, New York, 1943).




FIGURE 25 




$h$ by $h(x) = g(x)/e^x$. Differentiating $h$ gives

$$h'(x) = \frac{e^x g'(x) - g(x)e^x}{e^{2x}} = 0$$

since $g'(x) = g(x)$ and $De^x = e^x$. It follows that $h$ is a constant function and
that $g(x) = ce^x$ for some constant $c$.

The above discussion can be generalized to the case where the derivative
$g'$ is proportional to $g$, that is, where $g'(x) = kg(x)$ for some constant $k$.
An argument similar to that used above shows that in this case $g$ must have
the form $g(x) = ce^{kx}$, where $c$ is constant. The proof is left as an exercise.

Example 4 A typical example of natural phenomena in which the rate of
change of a certain quantity is proportional to the size of that quantity is
afforded by the growth of a bacterial culture. If we make the reasonable assumption
that the number of reproductions taking place at any instant of
time is proportional to the number of bacteria present at that time, then
$N(t)$, the number of bacteria present at time $t$, satisfies the equation

$$N'(t) = kN(t)$$

The number of bacteria in the culture can then be expected to grow exponentially
according to the law $N(t) = ce^{kt}$.

For example, if we begin (at $t = 0$) with 100 bacteria which are observed
to grow in $t = 1$ day to 1000 bacteria, we find

$$N(0) = c = 100 \qquad\text{and}\qquad N(1) = ce^k = 1000$$




7. $g(w) = e^{w^2}$ 8. $h(x) = e^{\tan x}$

9. $h(x) = x^2 e^{-x}$ 10. $g(u) = e^u\ln(\sin u)$

11. $f(w) = e^{-w}\cos w$ 12. $f(x) = \sin^{-1} e^x$

13. Prove part (b) of Theorem 12.

14. Derive properties (iii)-(vi) for logarithmic functions from Equation (31)
and properties (i) and (ii) for exponential functions.

15. Prove that if $g'(x) = kg(x)$ where $k$ is constant, then there is a constant
$c$ such that $g(x) = ce^{kx}$ for all $x$.

16. The world contains approximately 10 billion acres of arable land. Assuming
that a minimum of one-quarter of an acre is required to provide
sufficient food for one person, it would seem that world population is
limited to at most 40 billion. The 1965 population was about three billion.
If world population continues to grow at the present rate of 1.8% per
year, when will the 40 billion mark be reached?


12.8 APPLICATIONS OF DIFFERENTIATION— 

OPTIMIZATION AND STABILITY 

The classical use of differentiation is to find the optimum value or values of
a function. This problem arises in economics, for example, when one wishes
to maximize his profit or to minimize his loss. A function $f$ is said to have a
relative maximum at a point $a$ if there exists a number $\delta > 0$ such that
$f(x) \leq f(a)$ for all points $x$ in the domain of $f$ which lie within distance $\delta$ of
$a$. Similarly, we say that $f$ has a relative minimum at a point $b$ if there is a
$\delta > 0$ for which $f(x) \geq f(b)$ whenever $x \in D(f)$ and $|x - b| < \delta$. (See
Figure 26.)

A point $c$ is called an interior point of a set $S$ if there is a number $\delta > 0$
such that all points $x$ lying within $\delta$ distance of $c$ are members of $S$. (Note
that here, as above, the actual size of $\delta$ is not important except that $\delta$ must be
greater than zero. It may be very small or it may be large. The existence of


FIGURE 26 



Hence $c = 100$ and $k = \ln(1000/c) = \ln 10$, so that

$$N(t) = 100(e^{\ln 10})^t = 100(10)^t$$

If the growth continues at the same rate, there will be $N(6) = 100(10)^6$, or
100 million bacteria present at the end of 6 days. ►
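The two-step fit used in this bacterial growth example (read off $c$ from the count at $t = 0$, then solve for $k$ from the count at $t = 1$) can be written out as a short computation. The sketch below uses the numbers from the example, 100 bacteria initially and 1000 after one day; the function and variable names are illustrative assumptions.

```python
import math


def fit_exponential(count_at_0, count_at_1):
    """Return (c, k) for N(t) = c * e^(k t) from counts at t = 0 and t = 1."""
    c = count_at_0
    k = math.log(count_at_1 / c)
    return c, k


def population(t, c, k):
    """Evaluate the growth law N(t) = c * e^(k t)."""
    return c * math.exp(k * t)


c, k = fit_exponential(100.0, 1000.0)   # k works out to ln 10
after_six_days = population(6, c, k)     # about 100 * 10**6
print(c, k, after_six_days)
```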


All our results concerning derivatives are summarized in Table 2.

TABLE 2 Short Table of Derivatives

1. $Dc = 0$
2. $Dx = 1$
3. $Dx^n = nx^{n-1}$
4. $D[f(x) + g(x)] = Df(x) + Dg(x)$
5. $D[cf(x)] = cDf(x)$
6. $D[c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x)] = c_1 Df_1(x) + c_2 Df_2(x) + \cdots + c_n Df_n(x)$
7. $D[f(x)g(x)] = f(x)Dg(x) + g(x)Df(x)$
8. $D\left[\dfrac{f(x)}{g(x)}\right] = \dfrac{g(x)Df(x) - f(x)Dg(x)}{[g(x)]^2}$ if $g(x) \neq 0$
9. $D[g(f(x))] = Dg[f(x)] \cdot Df(x)$
10. $D[(f(x))^n] = n(f(x))^{n-1}\,Df(x)$
11. $D\sin x = \cos x$
12. $D\cos x = -\sin x$
13. $D\tan x = \sec^2 x$ if $\cos x \neq 0$
14. $D\cot x = -\csc^2 x$ if $\sin x \neq 0$
15. $D\sec x = \sec x\tan x$ if $\cos x \neq 0$
16. $D\csc x = -\csc x\cot x$ if $\sin x \neq 0$
17. $D\sin^{-1} x = \dfrac{1}{\sqrt{1 - x^2}}$ for $-1 < x < 1$
18. $D\cos^{-1} x = -\dfrac{1}{\sqrt{1 - x^2}}$ for $-1 < x < 1$
19. $D\tan^{-1} x = \dfrac{1}{1 + x^2}$
20. $D\ln x = \dfrac{1}{x}$ for $x > 0$
21. $D\log_b x = \dfrac{\log_b e}{x}$ for $x > 0$
22. $De^x = e^x$
23. $Db^x = (\ln b)b^x$


PROBLEMS

Find derivatives of each of the following functions.
!• /(x) = In (x + 2) 2 . i 

3 -'' W ~ 2 ' *./W = [|n'& + 2) ! F 




S. fix) » 2x‘ 


6 f{t ) = *<>«w 




The basic tools for problems involving maxima and minima are the follow- 
ing two theorems 


Theorem 14 Interior Maximum and Minimum Theorem Suppose a function
$f$ has either a relative maximum or a relative minimum at a point $c$
interior to the domain of $f$. Then if $f'(c)$ exists, it must be equal to zero.

PROOF Suppose $f$ has a relative maximum at the interior point $c$. Then
for some $\delta > 0$, $f(c + h) \leq f(c)$ for all $h$ such that $-\delta < h < \delta$. It follows
that the ratio $[f(c + h) - f(c)]/h$ is nonpositive when $h > 0$ and nonnegative when
$h < 0$, so that zero is the only possible value for

$$f'(c) = \lim_{h \to 0}\frac{f(c + h) - f(c)}{h}$$

The proof for the case when $f$ has a relative minimum at $c$ is similar and is
left as an exercise. ►


Theorem 15 Criteria for relative maxima and minima Suppose that the
function $f$ is continuous and has a derivative throughout an interval $[a, b]$.
(This interval may comprise part or all of the domain of $f$.)

(a) If $c$ is an interior point of $[a, b]$ and if for some number $\delta > 0$, we have
$f'(x) > 0$ when $c - \delta < x < c$ and $f'(x) < 0$ when $c < x < c + \delta$, then
$f$ has a relative maximum at $c$.

(b) If $c$ is an interior point of $[a, b]$ and if for some $\delta > 0$ we have $f'(x) < 0$
when $c - \delta < x < c$ and $f'(x) > 0$ when $c < x < c + \delta$, then $f$ has a
relative minimum at $c$.

(c) If there is a number $\delta > 0$ such that $f'(x) > 0$ (or $f'(x) < 0$) when $b - \delta <
x < b$, then $f$ has a relative maximum (or relative minimum) at $b$. (The
maximum or minimum is, of course, relative to points in $[a, b]$ which lie
near $b$.)

(d) If there is a number $\delta > 0$ such that $f'(x) > 0$ (or $f'(x) < 0$) when $a <
x < a + \delta$, then $f$ has a relative minimum (or relative maximum) at $a$.

PROOF The truth of the theorem is intuitively obvious from the sketches
in Figure 30. We shall give a formal proof only for part (a), the other cases
being similar. We first choose points $x$ and $y$, as in Figure 30(a), in such a
way that $c - \delta < x < c < y < c + \delta$. Then the Mean Value Theorem
guarantees the existence of points $\alpha$, between $x$ and $c$, and $\beta$, between $c$ and
$y$, for which

$$f(c) - f(x) = (c - x)f'(\alpha) \qquad\text{and}\qquad f(y) - f(c) = (y - c)f'(\beta)$$

The assumptions for part (a) guarantee that $f'(\alpha)$ is positive while $f'(\beta)$ is
negative. Since $c - x > 0$ and $y - c > 0$, it follows that $f(c) > f(x)$ and
$f(c) > f(y)$. This being true for arbitrarily chosen $x$ and $y$ points near $c$,
we see that $f$ has a relative maximum at $c$. ►

128 APPLICATIONS OF DIFFERENTIATION / 341 



even one value of $\delta$ which works is sufficient. The same considerations were,
of course, implicit in the definition, in Section 11.4, of continuity at a point.)

Example 1 (a) In the interval $[0, 1]$, all points except the endpoints 0 and
1 are interior points. For if $0 < c < 1$, we simply take $\delta$ to be the smaller of
the distances from $c$ to the endpoints, that is, the smaller of $c$ and $1 - c$.
Then if $|x - c| < \delta$, we must have $x$ between 0 and 1 also (Figure 27).

FIGURE 27 


-interior point of the interval 


(b) The absolute value function has a relative minimum at zero since
$|0| = 0$, and for all other $x$, $|x| > 0$ (Figure 28).

FIGURE 28 (graph of $f(x) = |x|$ with the relative minimum marked)


(c) The function defined by … − 3 when $0 \leq x \leq 4$ has a relative
minimum when $x = 0$ and a relative maximum at $x = 4$ (Figure 29). ►


FIGURE 29 






FIGURE 30 (sketches for Theorem 15; the horizontal axes mark the points a, a + δ, b − δ, and b)

The procedure for finding maximum and minimum values (or extreme
values) of a function is now clear from Theorems 14 and 15. These values
may occur at the endpoints of the domain of the function. (Typical cases
are linear programming problems like Examples 1-3 of Section 9.5.) Parts
(c) and (d) of Theorem 15 also apply here. When extreme values occur at
interior points, the derivative may not exist (as in Example 1(b)), but if it
does, then according to Theorem 14, it must be zero. In summary, then, to
find extreme values of a function:

(i) Check the endpoints of the domain by applying Theorem 15(c) or (d).

(ii) Check points where the derivative fails to exist, applying Theorem
15(a) and (b).

(iii) Check those points where the derivative is zero, applying Theorem 14
and Theorem 15(a) and (b).
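The three-step procedure above can be sketched in code for a simple case. The example below is an illustration under stated assumptions: it uses an illustrative cubic, $f(t) = t^3 - 5t^2 + 3t + 12$, restricts it to the interval $[0, 4]$ (an interval chosen only for this sketch), and finds the zeros of the derivative with the quadratic formula. Step (ii) contributes no candidates here because a polynomial has a derivative everywhere.

```python
import math


def f(t):
    """Illustrative cubic used for this sketch."""
    return t**3 - 5 * t**2 + 3 * t + 12


def f_prime(t):
    """Its derivative, 3t^2 - 10t + 3 = (3t - 1)(t - 3)."""
    return 3 * t**2 - 10 * t + 3


a, b = 0.0, 4.0  # assumed closed interval for the search

# Step (iii): zeros of the derivative, from the quadratic formula.
discriminant = 10**2 - 4 * 3 * 3
roots = [(10 - math.sqrt(discriminant)) / 6, (10 + math.sqrt(discriminant)) / 6]
critical_points = [t for t in roots if a < t < b]

# Step (i) adds the endpoints; step (ii) adds nothing for a polynomial.
candidates = [a] + critical_points + [b]

t_max = max(candidates, key=f)
t_min = min(candidates, key=f)
print("maximum at t =", t_max, "value", f(t_max))
print("minimum at t =", t_min, "value", f(t_min))
```

Comparing function values at the candidate points finds the largest and smallest values on the interval; classifying each interior critical point as a relative maximum or minimum still relies on the sign test of Theorem 15(a) and (b).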

Example 2 In Example 3 of Section 12.7, the instantaneous rate of change
of habit strength was found to be $H'(N) = 100ie^{-iN}$, where $i$ is a positive
constant. Since $N$ is nonnegative and $H'(N)$ is always positive, Theorem 15(d)
implies that $H$ has a minimum at $N = 0$. (See Figure 31.) ►





FIGURE 31 


H 



Example 3 The derivative of $f(t) = t^3 - 5t^2 + 3t + 12$ is $f'(t) = 3t^2 -
10t + 3 = (3t - 1)(t - 3)$. A little checking shows that this derivative is
positive when $t < \tfrac{1}{3}$, zero at $t = \tfrac{1}{3}$, negative between $t = \tfrac{1}{3}$ and $t = 3$, zero
again at $t = 3$, and positive when $t > 3$. It follows from Theorem 15(a) and
(b) that $f$ has a relative maximum when $t = \tfrac{1}{3}$ and a minimum at $t = 3$.
(See Figure 32.) ►

FIGURE 32 


/ 



Example 4* Suppose that the total number $n$ of repairs over the useful
life $t$ of a consumer durable is given by the function $n(t) = t^{\alpha}/\beta$, where $\alpha$
and $\beta$ are positive constants. Since an original unit is replaced after time $t$,
the average number of replacements per unit of time is $1/t$ and the average
replacement cost per unit of time is $c(t) = p/t$, where $p$ is the price of a new
unit.

From the definition of $n$, the number of repairs per unit of time is $n(t)/t$,

*Adapted from Brems, H., Quantitative Economic Theory (John Wiley & Sons, Inc., New
York, 1968), pp. 38-39.


128 APPLICATIONS OF DIFFERENTIATION / 343 



and if the average cost of a repair is r, the average repair cost per unit of 
time is r[H(0/f] Ignoring capital costs, rental costs RC per unit of time may 
thus be defined as . . 

RC(l) = E + r m 


Substituting t a /fi for n(f) gives 

KC«)-j + y°~' 

the derivative of which is 


This derivative is zero when /" = (3p/r(a — 1) or / = [fip/r(a — l)] 1/a » 
negative to the left of this point and positive to the right The point 



yields the minimum value 


pa \ r(a - l) 1~ 1/a 

« ~ lL PP J 

for tlic average rental cost per unit time 
A closer analysis of RC(t ) shows that for small /, the per-unit-time re- 
placement cost p / 1 dominates As t increases, this term decreases to zero 
while the per-unit-time repair cost (r/d)i a ~ l increases In terms of average 
rental costs, l f)p/r(a — l)j ,/a represents the optimal time for replacement of 
the item in question ► 
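The formulas of Example 4 can be checked with a short numerical sketch. The parameter values below (p = 100, r = 10, alpha = 2, beta = 5) are hypothetical and chosen only for illustration; the function names are likewise not from the text.

```python
def rental_cost(t, p, r, alpha, beta):
    """Average rental cost per unit time: replacement cost plus repair cost."""
    return p / t + (r / beta) * t ** (alpha - 1)


def optimal_time(p, r, alpha, beta):
    """Replacement time at which RC'(t) = 0 (assumes alpha > 1)."""
    return (beta * p / (r * (alpha - 1))) ** (1 / alpha)


def minimum_cost(p, r, alpha, beta):
    """Closed-form minimum of RC(t) stated in Example 4."""
    return (p * alpha / (alpha - 1)) * (r * (alpha - 1) / (beta * p)) ** (1 / alpha)


# Hypothetical parameter values, for illustration only.
p, r, alpha, beta = 100.0, 10.0, 2.0, 5.0
t_star = optimal_time(p, r, alpha, beta)

print(t_star, rental_cost(t_star, p, r, alpha, beta), minimum_cost(p, r, alpha, beta))
```

Evaluating the cost slightly to either side of the computed time gives a larger value, consistent with the sign change of the derivative described in the example.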


Example 5 Equilibrium and Stability In his analysis of the forces which activate an international arms race, Richardson* postulates that the rate of defense expenditure by a nation tends to be increased by such items as expenditures by a rival nation and by grievances held by a nation independently of existing defense budgets. He also assumes that the burden of maintaining a defense budget tends to decrease the rate of expenditure. Thus for two nations, Richardson's model of defense expenditure can be expressed as

    $E'(t) = aR(t) - \alpha E(t) + g$
                                            (33)
    $R'(t) = bE(t) - \beta R(t) + h$

where $E$ and $R$ denote expenditures of the respective nations, $a$ and $b$ are positive constants reflecting the degree of fear and insecurity regarding the intent of the other nation, $\alpha$ and $\beta$ (> 0) reflect the burden of maintaining a defense budget, and $g$ and $h$ indicate the respective grievances.

The system (33) is said to be in equilibrium when there is no tendency for $E$ and $R$ to change. That is, equilibrium occurs when $E'(t)$ and $R'(t)$ are both zero. The solution to the simultaneous equations

    $E' = aR - \alpha E + g = 0$
                                            (34)
    $R' = bE - \beta R + h = 0$

is called the equilibrium point, or the point of balance of power of the system. If $ab = \alpha\beta$, the two lines in (34) are parallel and no equilibrium point exists. The system is obviously unstable in this case. When $ab \neq \alpha\beta$, the point $(E_0, R_0)$ of balance of power is given by

    $E_0 = \frac{ah + \beta g}{\alpha\beta - ab}$,    $R_0 = \frac{bg + \alpha h}{\alpha\beta - ab}$

To obtain a better picture of the behavior of the system, let us indicate by horizontal shading the region in the E-R plane in which $E' > 0$ and by vertical shading the region in which $R' > 0$. Two examples are shown in Figure 33. Double shading indicates the region in which both $E$ and $R$ tend to increase and hence in which there is a drift toward a mutual arms race, while in the unshaded region there is a tendency toward mutual armament reductions and greater cooperation.

FIGURE 33 Two examples of an arms race. Heavy arrows indicate the direction of movement of the system.
(a) a = b = 3, g = h = 0; equilibrium point (0, 0)
(b) a = b = 1, α = β = 3, g = h = 4; equilibrium point (2, 2)

Stability of the system depends upon whether the movement over time is toward or away from the point of balance of power. In Figure 33(a), for instance, the doubly shaded region lies above and to the right of the equilibrium point (0, 0). In this region, then, the tendency is toward ever-increasing expenditure by both nations and away from the point of equilibrium. An unstable situation obtains.

*Richardson, L. F., Arms and Insecurity (Boxwood Press, Pittsburgh, Pa., 1960).




21. Referring to Example 5, draw a diagram depicting the behavior of the system when a = b = 2, α = β = 1, and g = h = 3.

22. Prove Theorem 14 for the case of a relative minimum.

23. Prove parts (b), (c), and (d) of Theorem 15. 


12.9 HIGHER-ORDER DERIVATIVES; 
TAYLOR'S THEOREM 


We have defined the derivative $f'$ of a function $f$ to be a new function which assigns to each point $x$ in the domain of $f$ the number

    $f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$

The derivative of $f'$, that is, the derivative of the derivative of $f$, is called the second derivative of $f$, denoted by $f''$ or by $D^2 f$. In symbols,

    $f''(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h}$    (35)

When the Leibniz notations

    $y = f(x)$  and  $f'(x) = \frac{dy}{dx}$

are used, the second derivative $f''(x)$ is usually denoted by

    $\frac{d^2 y}{dx^2}$

Example 1 (a) If $y = f(x) = 4x^3 + \frac{1}{2}x^2 + x - 1$, then

    $f'(x) = \frac{dy}{dx} = 12x^2 + x + 1$

    $f''(x) = \frac{d^2 y}{dx^2} = 24x + 1$

(b) The learning function

    $f(x) = \frac{Lx + Lc}{x + c + a}$

introduced in Problem 21 of Section 12.1 has the derivative

    $f'(x) = \frac{La}{(x + c + a)^2}$

The second derivative is

    $f''(x) = D\left[\frac{La}{(x + c + a)^2}\right] = \frac{-2La}{(x + c + a)^3}$





On the other hand, in Figure 33(b), the drift in both the shaded regions (increasing expenditure) and the unshaded region (decreasing expenditure) is toward the equilibrium point (2, 2). The system has a built-in stability in that any perturbation away from the point of balance of power is met by a gradual return to that point. ►
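The stable case of Figure 33(b) can be explored with a small simulation. This is a rough sketch only: it uses the parameter values of panel (b), a simple Euler step of size 0.01, and an arbitrary starting point (5, 0). None of these numerical choices, nor the function names, come from the text.

```python
def equilibrium(a, b, alpha, beta, g, h):
    """Point of balance of power (E0, R0) for Richardson's system (33)."""
    det = alpha * beta - a * b
    if det == 0:
        raise ValueError("ab = alpha*beta: the lines are parallel, no equilibrium point")
    return (a * h + beta * g) / det, (b * g + alpha * h) / det


def simulate(E, R, a, b, alpha, beta, g, h, dt=0.01, steps=1000):
    """Follow the system forward with Euler steps (an assumed numerical scheme)."""
    for _ in range(steps):
        dE = a * R - alpha * E + g
        dR = b * E - beta * R + h
        E, R = E + dt * dE, R + dt * dR
    return E, R


# Parameter values of Figure 33(b).
params_b = dict(a=1, b=1, alpha=3, beta=3, g=4, h=4)
E0, R0 = equilibrium(**params_b)

# Arbitrary starting expenditures away from equilibrium.
E_end, R_end = simulate(5.0, 0.0, **params_b)
print((E0, R0), (E_end, R_end))
```

Starting points on either side of the equilibrium drift back toward (2, 2), which is the built-in stability described above.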


PROBLEMS 


In Problems 1-13, find the relative maxima and/or minima of indicated 
functions 


3. h{t) = - 


1. f(x) - 2x - 3x 

2 1 

\ + t* 
y/w 

' 4+ w 
1 - 2x + x 2 


5. /(w) = 
7. h(x) = 


1 + x + 2x2 


2. g(*) = %x 3 — 2x 2 + 3x + 7 
4 - /(<) = ' 


6. g(z) - (z - l) s,3 (z - 4) 2 


«./(*) - 


(4 - x? 
3 - 2x 


9. *(y) - yV 1 - 4^2 10. /i(x) = X 1 ” 

11. g(l) - In (sin I) 12. g(iv) - e - ” 1 

13. r(6) - e -0 * sin 

14. What is the largest product which can be obtained using two positive numbers whose sum is 2?

15. The product of two numbers is 56. What is the smallest possible sum of the squares of the two numbers?

16. What number exceeds its square by the greatest amount?

17. A model of class size postulates an educational return of 10 units per student for a class size not exceeding 25. Return per student is postulated to decrease by 2% for each student in excess of 25. What class size gives maximum educational return?

18. As in Example 2 of Section 12 1, let x denote quantity demanded, and 
let p(x), r(x) = x />(*), C(x) and 6(x) = r(x) - C(x) be, respectively, 
the demand, revenue, cost, and profit functions, all of which are assumed 
to be differentiable for $x > 0$.

(a) Argue that maximum or minimum profit can occur (aside from $x = 0$) only when marginal revenue equals marginal cost.

(b) Show that if we assume $p(x) = \alpha - \beta x$ and $C(x) = bx^2 + ax + k$, where all constants are positive, then profit is maximized at output level $x_0 = (\alpha - a)/[2(\beta + b)]$.

19. Prove that the marginal revenue curve intersects the average revenue curve $A(x) = r(x)/x$ at the point where average revenue is a maximum.

20. Assume the total cost function for a firm is TC(x ) = 75 + 2* — 2* 2 + 

where x denotes output level 

(a) Find the output level for which total cost is a minimum.

(b) Determine the output level which minimizes average cost. Verify that at this level, marginal cost is equal to average cost.





FIGURE 35 



The third derivative of a function $f$ is denoted by $f'''$ or by $D^3 f$ and, in general, the $n$th derivative is denoted by $f^{(n)}$ or by $D^n f$. (Following the Leibniz notation, the notations $d^3y/dx^3$ and $d^ny/dx^n$ are also in common use.)

Example 3 (a) If $g(x) = x^3 - x^2 + 9x + 2$, then $g'(x) = 3x^2 - 2x + 9$, $g''(x) = 6x - 2$, $g'''(x) = 6$, and $g^{(4)}(x) = 0$. In fact, $g^{(n)}(x) = 0$ for all $x$ if $n \geq 4$.

(b) If $h(t) = \sin t$, then $Dh(t) = \cos t$, $D^2h(t) = -\sin t$, $D^3h(t) = -\cos t$, $D^4h(t) = \sin t$, etc. The cycle is repeated at every fourth derivative.

(c) The third derivative of the learning function (see Example 1(b)) is

    $f'''(x) = \frac{6La}{(x + c + a)^4}$ ►


PROBLEMS 

1. Show by induction that the $n$th derivative of the learning function

    $f(x) = \frac{Lx + Lc}{x + c + a}$

is given by

    $f^{(n)}(x) = \frac{(-1)^{n+1}\, n!\, La}{(x + c + a)^{n+1}}$

2. Show that for $n = 1, 2, 3, \ldots$, the $n$th derivative of $\ln x$ is

    $\frac{(-1)^{n-1}(n - 1)!}{x^n}$

3. Find formulas for the $n$th derivatives of $e^x$ and $\cos x$.





(c) In Example 6 of Section 12.2, the satisfaction function $S(R) = aR/(g - R)$ was found to have the derivative $S'(R) = ag/(g - R)^2$. Its second derivative is

    $D^2\left[\frac{aR}{g - R}\right] = D\left[\frac{ag}{(g - R)^2}\right] = \frac{2ag}{(g - R)^3}$ ►

Some insight may be gained into the meaning of the second derivative if we recall our original interpretation of a derivative as a rate of change. The second derivative indicates the rate of change of the first derivative, which in turn represents the slope of the function being differentiated. If the second derivative is positive, the slope must increase, while if the second derivative is negative the slope decreases. The two cases are shown in Figure 34. A function whose second derivative is positive is called convex, or, sometimes, concave upward, while one whose second derivative is negative is concave, or concave downward.

FIGURE 34 (a) positive second derivative (b) negative second derivative

Example 2 The first derivative of $f(x) = x^3 - 3x^2 + 4$ (see Figure 35) is

    $f'(x) = 3x^2 - 6x = 3x(x - 2)$

and the second derivative is

    $f''(x) = 6x - 6 = 6(x - 1)$

(See Example 3 of Section 12.3.) When $x$ is less than 1, the second derivative is negative, and hence the function is concave. The second derivative is positive and the curve convex when $x$ exceeds 1. ►

The point (1, 2) in the above example is the point at which the graph changes from concave to convex. It is useful to distinguish such points from other points on a given graph by the name point of inflection.

Third- and higher-order derivatives are obtained by successive differentiation; that is, the third derivative is the derivative of the second derivative, the fourth derivative is the derivative of the third derivative, and so forth.




Combining the first term $-f'(x)$ with the first sum and then replacing $k$ by $k - 1$ in this sum shows that all terms in the first two sums cancel, except the term $[-f^{(n)}(x)/(n - 1)!](b - x)^{n-1}$. Inserting $x = c$, where $\phi'(c) = 0$, gives

    $0 = \frac{-f^{(n)}(c)}{(n - 1)!}(b - c)^{n-1} + \left[f(b) - f(a) - \sum_{k=1}^{n-1} \frac{f^{(k)}(a)}{k!}(b - a)^k\right] \frac{n(b - c)^{n-1}}{(b - a)^n}$

Cancelling the (nonzero) common factor $(b - c)^{n-1}$, multiplying by $(b - a)^n/n$, and solving for $f(b)$ yields the desired equation (36). ►

One typical use of Taylor's Theorem is in conjunction with the Interior Maximum and Minimum Theorem. According to that theorem, at any maximum or minimum point of a function $f$ which is interior to the domain of $f$, the derivative, if it exists, must be equal to zero. Suppose, then, that $f'(x) = 0$ at $x = a$ and that the second derivative of $f$ exists in an interval which contains $a$ as an interior point. Taylor's Theorem with $n = 2$ becomes

    $f(b) - f(a) = \frac{1}{2} f''(c)(b - a)^2$

where $c$ lies between $a$ and $b$. If we assume that $f''$ is continuous at $a$, so that $f''(c)$ will have the same sign as $f''(a)$ if we take $b$ (and hence $c$, also) sufficiently close to $a$, the following facts may be observed (a glance at Figure 34 will make the statements geometrically obvious):

(i) If $f''(a) > 0$, then $\frac{1}{2} f''(c)(b - a)^2 > 0$ and $f(b) > f(a)$ for all $b$ near $a$. Hence $a$ is a point of relative minimum.

(ii) If $f''(a) < 0$, then $\frac{1}{2} f''(c)(b - a)^2 < 0$ and $f(b) < f(a)$ for points $b$ near $a$. In this case, $a$ is a point of relative maximum of $f$.

(iii) If $f''(a) = 0$, no definite statement can be made since $f''$ may be both negative and positive at different points near $a$.


Example 4 Let $R(x)$ and $C(x)$ denote, respectively, the revenue and cost associated with the production of $x$ units of a certain commodity. Then the profit function $P = R - C$ is maximized at $x = a$ if $P'(a) = 0$ and $P''(a) < 0$. The requirement $P'(a) = R'(a) - C'(a) = 0$ states that marginal revenue must equal marginal cost. The condition $P''(a) = R''(a) - C''(a) < 0$ or, equivalently, $R''(a) < C''(a)$, indicates that at the point of greatest profit, marginal cost must be increasing more rapidly than marginal revenue. ►

If $f''(a)$ is zero when $f'(a) = 0$, our test for a relative maximum or minimum yields no results. However, this may be remedied as follows. Suppose the derivatives $f'(a), f''(a), \ldots,$ and $f^{(n-1)}(a)$ are all zero and that $f^{(n)}(a) \neq 0$ is the first nonzero derivative of $f$ at $a$. Then Taylor's Theorem becomes

    $f(b) - f(a) = \frac{f^{(n)}(c)}{n!}(b - a)^n$    $(a < c < b)$





The most important applications of second and higher derivatives arise in connection with an extension of the Mean Value Theorem, called Taylor's Theorem.*

Taylor's Theorem Suppose the function $f$ and its first $n$ derivatives $f', f'', \ldots, f^{(n)}$ are defined and continuous in an interval $[a, b]$. Then there is at least one point $c$ between $a$ and $b$ for which

    $f(b) = f(a) + f'(a)(b - a) + \frac{f''(a)}{2!}(b - a)^2 + \frac{f'''(a)}{3!}(b - a)^3 + \cdots + \frac{f^{(n-1)}(a)}{(n - 1)!}(b - a)^{n-1} + \frac{f^{(n)}(c)}{n!}(b - a)^n$    (36)

►

Before proving Taylor's Theorem, let us see how it relates to the Mean Value Theorem. When $n = 1$, Taylor's Theorem states that, subject to the assumed existence of the first derivative, there is at least one point $c$ (where $a < c < b$) for which $f(b) = f(a) + f'(c)(b - a)$. This, of course, is the statement of the Mean Value Theorem.
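To make the role of the intermediate point c concrete, the following sketch locates it for one particular case. The choices f(x) = e^x, a = 0, b = 1, and n = 3 are assumptions made purely for illustration; they are convenient because every derivative of e^x is again e^x, so the equation for c can be solved with a logarithm.

```python
import math

# Illustrative choices, not taken from the text: f = exp on [0, 1] with n = 3.
a, b, n = 0.0, 1.0, 3

# Terms of (36) up to degree n - 1; every derivative of exp at a equals exp(a).
partial = sum(math.exp(a) * (b - a) ** k / math.factorial(k) for k in range(n))

# What is left over must equal f^(n)(c) (b - a)^n / n! for some c between a and b.
remainder = math.exp(b) - partial

# Solve exp(c) (b - a)^n / n! = remainder for c.
c = math.log(remainder * math.factorial(n) / (b - a) ** n)

print(c)
```

The computed c lies strictly between a and b, as the theorem asserts; for other functions the equation for c generally has to be solved numerically.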


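PROOF OF TAYLOR'S THEOREM We begin, as in the proof of the Mean Value Theorem, by inventing a function which serves our purpose and to which we can apply Rolle's Theorem. It is easily verified that

    $\phi(x) = f(b) - f(x) - f'(x)(b - x) - \frac{f''(x)}{2!}(b - x)^2 - \cdots - \frac{f^{(n-1)}(x)}{(n - 1)!}(b - x)^{n-1}$
    $\qquad - \left[f(b) - f(a) - f'(a)(b - a) - \cdots - \frac{f^{(n-1)}(a)}{(n - 1)!}(b - a)^{n-1}\right] \frac{(b - x)^n}{(b - a)^n}$

    $\quad = f(b) - f(x) - \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{k!}(b - x)^k - \left[f(b) - f(a) - \sum_{k=1}^{n-1} \frac{f^{(k)}(a)}{k!}(b - a)^k\right] \frac{(b - x)^n}{(b - a)^n}$

is such a function. That is, $\phi(a) = \phi(b) = 0$ and $\phi'$ exists since $f$ has $n$ derivatives. Hence Rolle's Theorem guarantees the existence of a point $c$ between $a$ and $b$ at which $\phi'(c) = 0$.

Differentiating $\phi$ gives

    $\phi'(x) = -f'(x) - \sum_{k=1}^{n-1} \frac{f^{(k+1)}(x)}{k!}(b - x)^k + \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{(k - 1)!}(b - x)^{k-1}$
    $\qquad + \left[f(b) - f(a) - \sum_{k=1}^{n-1} \frac{f^{(k)}(a)}{k!}(b - a)^k\right] \frac{n(b - x)^{n-1}}{(b - a)^n}$

*After Brook Taylor (1685-1731), an English mathematician.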





Example 5 (a) The graph of $f(x) = x^3$ is shown in Figure 36. At $x = 0$, both the first derivative $f'(x) = 3x^2$ and the second derivative $f''(x) = 6x$ are zero. The third derivative $f'''(x) = 6$ being the first nonzero derivative, it follows that $x = 0$ is neither a maximum nor a minimum point for $f$.

(b) The function $f(x) = (4 - x)^4$ has first, second, and third derivatives $f'(x) = -4(4 - x)^3$, $f''(x) = 12(4 - x)^2$, and $f'''(x) = -24(4 - x)$ equal to zero when $x = 4$. The fourth derivative $f^{(4)}(x) = 24$ being positive, we conclude that $f(4) = 0$ is a relative minimum value for $f$. (See Figure 37.)

PROBLEMS 

In Problems 4-13, find the first and second derivatives of the given function 
4. fix) =* x 3 + 3x 2 — 2x + 6 5. g(u) = « 5 — 2u A + 

6. s(z) ~ 'Vz* — ^2 7. w(x) = Vx 

Vx 

8. h(y) — Vy2 + 1 9. /O’) = 

2 + Vy 

10. y(x ) = xsm 2x 11. fix) = ^ 

12. /*(/) = t sm _1 1 13. g(x) = In (x — v / x"M r o 5 ) 

14. Find the first four derivatives of 

(a) x 3 — 2x 2 4- x — 8 (b) x 3 — In x 

(c) e"** ~f cosx (d) (sin x) 4 

15. In each case, the behavior of a function is described, for all points $x$ in an interval $[a, b]$, in terms of its first and second derivatives. Sketch a possible graph of the function.

(a) $f'(x) > 0$ and $f''(x) > 0$    (b) $f'(x) > 0$ and $f''(x) < 0$
(c) $f'(x) > 0$ and $f''(x) = 0$    (d) $f'(x) < 0$ and $f''(x) > 0$
(e) $f'(x) < 0$ and $f''(x) < 0$

16. In each case the behavior of a function at a fixed point $c$ is described in terms of its derivatives. Sketch a possible graph of the function.

(a) $f'(c) = 0$ and $f''(c) > 0$

(b) $f'(c) = 0$ and $f''(c) < 0$

(c) $f'(c) = 0$, $f''(c) = 0$ and $f'''(c) \neq 0$

(d) $f'(c) > 0$ and $f''(c) = 0$

(e) $f'(c) < 0$ and $f''(c) = 0$

17. Use the second derivative test to verify that the value

    $\frac{p\alpha}{\alpha - 1}\left[\frac{r(\alpha - 1)}{\beta p}\right]^{1/\alpha}$

obtained in Example 4 of Section 12.8 is actually a minimum value.





There are two cases to consider:

(i) If $n$ is even, the argument is the same as for $n = 2$. In this case $(b - a)^n/n!$ is positive so that everything depends on $f^{(n)}(c)$. If $f^{(n)}(a) > 0$, so that $f^{(n)}(c) > 0$ also, then $a$ is a point of relative minimum. If $f^{(n)}(a) < 0$, then $a$ is a relative maximum.

(ii) If $n$ is odd, then $(b - a)^n$ is positive when $b > a$, but negative when $b < a$. Hence, whether $f^{(n)}(a)$ is positive or negative, there are points near $a$ at which $f(b) > f(a)$ and other points at which $f(b) < f(a)$. The point $a$ is neither maximum nor minimum.
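The two cases above amount to a small decision procedure, sketched here. The function name and the convention of passing the derivative values as a list are illustrative assumptions; the derivative values themselves must be supplied by hand.

```python
def classify_critical_point(derivatives_at_a):
    """Classify a point a from the values f'(a), f''(a), f'''(a), ... in order.

    The first nonzero entry decides the outcome: even order gives a relative
    minimum (positive value) or maximum (negative value); odd order gives neither.
    """
    for order, value in enumerate(derivatives_at_a, start=1):
        if value != 0:
            if order % 2 == 1:
                return "neither"
            return "relative minimum" if value > 0 else "relative maximum"
    # Every supplied derivative vanished, so more derivatives are needed.
    return "undetermined"


# f(x) = x^3 at x = 0: f' = 0, f'' = 0, f''' = 6.
print(classify_critical_point([0, 0, 6]))
# f(x) = (4 - x)^4 at x = 4: first three derivatives vanish, fourth is 24.
print(classify_critical_point([0, 0, 0, 24]))
```

The two calls reproduce the conclusions reached for the functions x^3 and (4 - x)^4 in this section.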

FIGURE 36

FIGURE 37

DIFFERENCE EQUATIONS 13 


13.1 CONCEPT OF A DIFFERENCE 

The data with which a behavioral or management scientist works is ordinarily discrete in nature. That is, it takes the form of a sequence of values, usually obtained at different points in time. Economic data, for instance, is typically available through periodic reports: daily, monthly, yearly, or, in the case of the census, every ten years. In psychology and sociology, experiments often proceed through a sequence of trials, the experimental data consisting of various measures associated with trial outcomes.

Two different points of view can be adopted for the analysis of such data. In the first point of view, it may be assumed that the underlying process operates continuously in time and that the experimental data constitute a sampling of the process at isolated time points. In this case, a typical model describes the process by continuous functions, the derivatives of which mirror various changes associated with the process. Thus calculus techniques predominate in the analysis of the data. Examples 3-5 of Section 12.1, Example 4 of Section 12.7, and Examples 4 and 5 of Section 12.8 provide typical illustrations.

The second point of view is that the observed data constitute the entire process. In this case, the data appear at isolated time points rather than throughout an interval, so that techniques of calculus are inappropriate and one works directly with the data sequence. A useful method for doing this is the method of differences and of difference equations, the subjects of this chapter.

Let us assume that there is a function $f$ whose value $f(n)$ at time (or trial) $n$ provides a numerical measure of the experimental outcome for that time period. The change in the process between time $n$ and time $n + 1$ is then





18. Suppose we have the cost function $C(x) = ax^3 + bx^2 + cx + d$, where, in order to make economic sense, we assume $d > 0$ and require that $C$ be monotonically increasing for $x > 0$. What restrictions do these conditions impose on the coefficients?

19. A monopolist has respective revenue and cost functions $R(x) = -\alpha x^2 + \beta x$ and $C(x) = ax^2 + bx + c$, where all constants are assumed positive. The government imposes a tax at the rate of $t$ dollars per unit output. Find the tax rate $t$ which will maximize the tax revenue function $T(x) = t x_t$, where $x_t$ is the output which maximizes profit after imposition of the tax at rate $t$.

20. Prove that the function $f(x) = x^3 + ax^2 + bx + c$ has neither a relative maximum nor a relative minimum if and only if $a^2 \leq 3b$.


SUPPLEMENTARY READING

Stein, S. K., Calculus in the First Three Dimensions (McGraw-Hill Book Company, New York, 1967), Chapters 2-5, 10, 11, and 21-23.

Thomas, G. B., Calculus and Analytic Geometry, 4th ed. (Addison-Wesley Publishing Company, Reading, Massachusetts, 1968), Chapters 1-4 and 7.





the third difference is the difference of the second difference:

    $\Delta^3 f(n) = \Delta[\Delta^2 f(n)] = \Delta^2 f(n + 1) - \Delta^2 f(n)$

and, in general, the $k$th difference

    $\Delta^k f(n) = \Delta[\Delta^{k-1} f(n)]$    (3)

is the difference of the $(k - 1)$st difference.

Example 3 For the function $g$ in Example 2, we find

    $\Delta^2 g(n) = \Delta[\Delta g(n)]$
    $\quad = \Delta[4n - 1]$
    $\quad = [4(n + 1) - 1] - [4n - 1]$
    $\quad = 4$

and

    $\Delta^3 g(n) = \Delta[\Delta^2 g(n)]$
    $\quad = \Delta^2 g(n + 1) - \Delta^2 g(n)$
    $\quad = 4 - 4 = 0$

for every value of $n$. Then $\Delta^4 g(n) = 0$, $\Delta^5 g(n) = 0$, etc. In fact, for this example $\Delta^k g(n)$ is 0 for every value of $n$ whenever $k$ is greater than 2.

A part of our results are exhibited in Table 1, called a difference table. Of course, the values of $n$ may be extended indefinitely in each direction. ►

TABLE 1. Difference table for g(n) = 2n^2 - 3n + 2

  n    g(n)   Δg(n)   Δ²g(n)   Δ³g(n)   Δ⁴g(n)

 -2     16     -9       4        0        0
 -1      7     -5       4        0        0
  0      2     -1       4        0        0
  1      1      3       4        0
  2      4      7       4
  3     11     11
  4     22
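A difference table such as Table 1 can be generated mechanically by repeated subtraction of neighboring entries. The sketch below is one way to do it; the function name and the list-of-columns layout are illustrative choices rather than anything prescribed by the text.

```python
def g(n):
    """The function of Example 2."""
    return 2 * n**2 - 3 * n + 2


def difference_table(values, depth):
    """Return [values, first differences, second differences, ...] as lists."""
    columns = [list(values)]
    for _ in range(depth):
        prev = columns[-1]
        # Each new column holds the differences of neighboring entries.
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns


ns = range(-2, 5)
table = difference_table([g(n) for n in ns], depth=4)
for order, column in enumerate(table):
    print(order, column)
```

Each column is one entry shorter than the one before it, which is why the later columns of the table cannot be filled in for the largest values of n.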







represented by the first difference

    $\Delta f(n) = f(n + 1) - f(n)$

which is positive or negative depending on whether there was an increase or a decrease in the experimental results.

Example 1 A typical economic example is the cumulative growth of a sum of money at compound interest. An amount $A_0$ invested at $100i\%$ compounded annually grows to an amount $A_n$ after $n$ years. The amount $A_{n+1}$ after $n + 1$ years is related to $A_n$ by

    $A_{n+1} = (1 + i)A_n$

The difference

    $\Delta A_n = A_{n+1} - A_n = iA_n$    (1)

represents the accumulated interest on the amount $A_n$ during the period from time $n$ to time $n + 1$.

Because it involves the difference $\Delta A_n = A_{n+1} - A_n$, Equation (1) is called a difference equation. The solution of this equation, as we saw in Problem 3 of Section 5.2, is

    $A_n = (1 + i)^n A_0$    (2)

Methods for solving difference equations will be investigated in detail in Sections 13.4-13.6. ►
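The relationship between the difference equation (1) and its solution (2) can be checked by direct iteration. In this sketch the starting amount of 1000, the rate of 5%, and the horizon of 10 years are hypothetical values; exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def grow(a0, i, years):
    """Iterate A_{n+1} = A_n + i*A_n, i.e. add the difference (1) each year."""
    amount = a0
    for _ in range(years):
        amount = amount + i * amount
    return amount


# Hypothetical values: 1000 invested at 5% for 10 years.
a0, i, years = Fraction(1000), Fraction(5, 100), 10
iterated = grow(a0, i, years)
closed_form = (1 + i) ** years * a0  # Equation (2)

print(float(iterated), float(closed_form))
```

Stepping the difference equation forward year by year and evaluating the closed form give the same amount, which is what it means for (2) to solve (1).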


Example 2 If $g(n) = 2n^2 - 3n + 2$, the difference between the functional values at $n = 0$ and $n = 1$ is

    $\Delta g(0) = g(1) - g(0) = 1 - 2 = -1$

Similarly, the change between $n = 1$ and $n = 2$ is

    $\Delta g(1) = g(2) - g(1) = 4 - 1 = 3$

In general, for any value of $n$, the first difference function $\Delta g$ evaluated at $n$ is

    $\Delta g(n) = g(n + 1) - g(n)$
    $\quad = 2(n + 1)^2 - 3(n + 1) + 2 - [2n^2 - 3n + 2]$
    $\quad = 4n - 1$ ►

Higher differences of functions are defined successively, as were derivatives. Thus the second difference $\Delta^2 f$ of $f$ is the difference of the first difference:

    $\Delta^2 f(n) = \Delta[\Delta f(n)] = \Delta f(n + 1) - \Delta f(n)$





Repeated use of Equation (3) allows us to write any difference in terms of values of $f$ alone. For example,

    $\Delta^2 f(n) = \Delta(\Delta f(n)) = \Delta f(n + 1) - \Delta f(n)$
    $\quad = [f(n + 2) - f(n + 1)] - [f(n + 1) - f(n)]$
    $\quad = f(n + 2) - 2f(n + 1) + f(n)$    (4)

and

    $\Delta^3 f(n) = \Delta(\Delta^2 f(n)) = \Delta^2 f(n + 1) - \Delta^2 f(n)$
    $\quad = [f(n + 3) - 2f(n + 2) + f(n + 1)] - [f(n + 2) - 2f(n + 1) + f(n)]$
    $\quad = f(n + 3) - 3f(n + 2) + 3f(n + 1) - f(n)$    (5)

Other formulas are derived in the exercises but will not be required in the text.
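Formulas (4) and (5) can be spot-checked by treating the difference as an operation that turns one function into another. The helper `delta` and the test function f(n) = n^4 - 2n below are illustrative assumptions, not part of the text.

```python
def delta(f):
    """Return the first difference of f as a new function of n."""
    return lambda n: f(n + 1) - f(n)


def f(n):
    # Arbitrary test function chosen for illustration.
    return n**4 - 2 * n


second = delta(delta(f))   # second difference, built recursively as in (3)
third = delta(second)      # third difference

for n in range(-3, 4):
    # Compare the recursive definition with the expanded forms (4) and (5).
    print(n, second(n) == f(n + 2) - 2 * f(n + 1) + f(n),
          third(n) == f(n + 3) - 3 * f(n + 2) + 3 * f(n + 1) - f(n))
```

The recursive definition and the expanded formulas agree at every integer tried, as they must for any function defined on the integers.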


Example 4 Differences may be given physical interpretations. If, for example, $f(t)$ denotes the distance traveled by an automobile in time $t$, then $\Delta f(t) = f(t + 1) - f(t)$ represents the change in distance in the unit of time from $t$ to $t + 1$ and as such gives the average velocity over this time interval (compare with Section 12.1).

When Patrolman Smith checks your speed by stretching two cables across the road, he computes such an average velocity. However, in this case the elapsed time might be any amount $h$ rather than 1 unit. The computation for average speed becomes

    $\frac{\Delta_h f(t)}{h} = \frac{f(t + h) - f(t)}{h}$

the quantity $\Delta_h f(t)$ being the first difference of $f$ over the interval from $t$ to $t + h$. ►


Example 5 In Harrod's* classic one-country model of economic growth, he postulates that

(i) A country's net savings are proportional to net income.

(ii) A country's desired investment is proportional to its increase in income.

Thus, denoting national income, net savings, and desired investment during time period $t$ by $Y(t)$, $S(t)$, and $I(t)$, respectively, he writes

(i) $S(t) = cY(t)$

(ii) $I(t) = g[Y(t) - Y(t - 1)] = g\,\Delta Y(t - 1)$

'ppwS 10 113110101 W J * Economie Dinanuc, (Macmillan, lnc , New York, 1959), 





where $c$ and $g$ are positive constants representing the propensities to save and to invest.

Harrod assumes that net investment is realized only if $S(t) = I(t)$, in which case (i) and (ii) yield the result that

    $\Delta Y(t - 1) = Y(t) - Y(t - 1) = \frac{c}{g}\,Y(t)$

The amount of growth being proportional to total income, it follows that national income must grow by ever increasing amounts. ►
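The growth pattern implied by Harrod's condition can be traced numerically. Solving Y(t) - Y(t - 1) = (c/g)Y(t) for Y(t) gives Y(t) = Y(t - 1)/(1 - c/g), which is the step used below. The values c = 1/10, g = 1/2, and Y(0) = 100 are hypothetical, and the requirement c < g is an added assumption needed for income to stay positive and grow.

```python
from fractions import Fraction


def harrod_path(y0, c, g, periods):
    """National income path satisfying Y(t) - Y(t-1) = (c/g) * Y(t)."""
    if not 0 < c < g:
        raise ValueError("this sketch assumes 0 < c < g")
    growth = 1 / (1 - c / g)  # Y(t) = Y(t-1) / (1 - c/g)
    path = [y0]
    for _ in range(periods):
        path.append(path[-1] * growth)
    return path


# Hypothetical propensities to save and invest, and starting income.
c, g = Fraction(1, 10), Fraction(1, 2)
path = harrod_path(Fraction(100), c, g, 5)
increments = [path[t] - path[t - 1] for t in range(1, len(path))]

print([float(y) for y in path])
print([float(d) for d in increments])
```

Each increment is larger than the one before it, in line with the conclusion that income grows by ever increasing amounts.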

PROBLEMS 

In Problems 1-9, find (a) $\Delta y(0)$, $\Delta y(1)$, $\Delta y(2)$, $\Delta y(3)$, $\Delta y(n)$, and (b) $\Delta^2 y(0)$, $\Delta^2 y(1)$, $\Delta^2 y(2)$, $\Delta^2 y(3)$, $\Delta^2 y(n)$.

1. $y(n) = 2$    2. $y(n) = n + 1$    3. $y(n) = 5n + 2$

4. $y(n) = n^2$    5. $y(n) = n(n - 1)$    6. $y(n) = n^2 - 2n + 1$

7. y(n) = - 8. >(/i) = 2“ 9. >(«) = ~ 

it 3" 

10 Construct a difference table through the fourth difference for g(n) = n z 
using n = 1, 2, ,6 

11. Find general formulas for $\Delta^4 f(n)$ and $\Delta^5 f(n)$. (See Equations (4) and (5).)

12. By induction, prove that the general difference formula is

    $\Delta^k f(n) = \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} f(n + j)$

[Hint: Use the recursive definition $\Delta^k f(n) = \Delta(\Delta^{k-1} f(n))$.]

13. Verify that

(a) $\Delta^2 n^2 = 2$    (b) $\Delta^3 n^3 = 3 \cdot 2 \cdot 1$    (c) $\Delta^4 n^4 = 4 \cdot 3 \cdot 2 \cdot 1$

14. Recall from Section 7.2 that $(n)_r = n(n - 1)(n - 2) \cdots (n - r + 1)$. Prove that

    $\Delta (n)_r = r(n)_{r-1}$

15. Find A‘(n) , for k - 2, 3, 4, 5, and 6 

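16. Show that

(a) $\Delta a^n = a^n(a - 1)$

(b) $\Delta 2^n = 2^n$

(c) $\Delta \sin an = 2 \sin\frac{a}{2} \cos a(n + \frac{1}{2})$

(d) $\Delta \cos an = -2 \sin\frac{a}{2} \sin a(n + \frac{1}{2})$

(e) $\Delta \ln n = \ln\left(1 + \frac{1}{n}\right)$

17. If $f(n)^{(r)}$ is defined as $f(n)^{(r)} = f(n) f(n - 1) \cdots f(n - r + 1)$, verify that $\Delta(a + bn)^{(r)} = br(a + bn)^{(r-1)}$.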





18 Find the following higher order differences 
(a) ( b ) A' 2 " 

13.2 BASIC PROPERTIES OF DIFFERENCES

In this section we discuss certain properties of differences which will be of importance in the solution of difference equations.

Property 1 Operations of taking differences commute with one another. Symbolically, for all nonnegative integers $k$ and $r$,

    $\Delta^k[\Delta^r f(n)] = \Delta^r[\Delta^k f(n)] = \Delta^{k+r} f(n)$

PROOF The truth of this assertion is apparent from the recursive nature of Equation (3). In order to obtain the $(k + r)$th difference of $f$ we may first compute $r$ differences and then $k$ more, or first $k$ and then $r$ more. ►

Property 2 If $c$ is any constant and $f$ a function of $n$, then

    $\Delta^k[cf(n)] = c\Delta^k f(n)$    (6)

That is, the operations of differencing and multiplying by a constant commute.

PROOF The result holds for $k = 1$ since

    $\Delta^1[cf(n)] = \Delta[cf(n)] = cf(n + 1) - cf(n)$
    $\quad = c[f(n + 1) - f(n)] = c\Delta f(n)$    (7)

If (6) is true for $k = m$, then

    $\Delta^{m+1}[cf(n)] = \Delta[\Delta^m cf(n)]$
    $\quad = \Delta[c\Delta^m f(n)]$    [by the inductive assumption]
    $\quad = c\Delta[\Delta^m f(n)]$    [using (7)]
    $\quad = c\Delta^{m+1} f(n)$    [from (3)] ►

Property 2 does not hold if $c$ is not constant. For instance,

    $\Delta[nf(n)] = (n + 1)f(n + 1) - nf(n)$
    $\quad = n\Delta f(n) + f(n + 1)$
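Property 2 and the cautionary example that follows it can both be checked on a concrete function. The sketch below uses a hypothetical function f(n) = n^2 + 1 and constant c = 7; the helper `delta` is an illustrative way of representing the difference operation.

```python
def delta(f):
    """Return the first difference of f as a new function of n."""
    return lambda n: f(n + 1) - f(n)


def f(n):
    # Arbitrary test function chosen for illustration.
    return n**2 + 1


c = 7  # an arbitrary constant


def scaled(n):
    return c * f(n)      # constant multiple of f


def weighted(n):
    return n * f(n)      # the multiplier n is NOT constant


for n in range(-2, 3):
    print(n,
          delta(scaled)(n) == c * delta(f)(n),                 # Property 2
          delta(weighted)(n) == n * delta(f)(n) + f(n + 1),    # corrected rule
          delta(weighted)(n) == n * delta(f)(n))               # naive rule
```

The first two comparisons hold at every point, while the naive rule that treats n like a constant fails, because the extra term f(n + 1) is never zero for this choice of f.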

Property 3 The operation of taking differences distributes over sums of functions. In symbols, if

    $g(n) = f_1(n) + f_2(n) + \cdots + f_r(n)$

then

    $\Delta^k g(n) = \Delta^k f_1(n) + \Delta^k f_2(n) + \cdots + \Delta^k f_r(n)$    (8)

for every positive integer $k$. (Loosely speaking, this says that the difference of a sum equals the sum of the differences. Compare this statement with Theorem 2(c) in Section 12.2.)

PROOF It is convenient to use summation notation. Since

    $g(n) = \sum_{j=1}^{r} f_j(n)$

we have

    $\Delta g(n) = \Delta\left[\sum_{j=1}^{r} f_j(n)\right] = \sum_{j=1}^{r} f_j(n + 1) - \sum_{j=1}^{r} f_j(n)$
    $\quad = \sum_{j=1}^{r} [f_j(n + 1) - f_j(n)]$
    $\quad = \sum_{j=1}^{r} \Delta f_j(n)$    (9)

proving that (8) is true for first differences.

As an inductive assumption suppose that (8) is true for $k = m$. Then

    $\Delta^{m+1} g(n) = \Delta^{m+1}\left[\sum_{j=1}^{r} f_j(n)\right]$
    $\quad = \Delta\left\{\sum_{j=1}^{r} \Delta^m f_j(n)\right\}$    [inductive assumption]
    $\quad = \sum_{j=1}^{r} \Delta[\Delta^m f_j(n)]$    [using (9)]
    $\quad = \sum_{j=1}^{r} \Delta^{m+1} f_j(n)$

completing the proof. ►

Example 1 In their studies of the sense of time in humans, McGrath and O'Hanlon* found that a person's subjective estimate $E(t)$ of time is not, in general, equal to the elapsed time $t$ but is related by a function of the form

    $E(t) = a + bt$

where $a$ and $b$ are constants. The rate of change RST of subjective time is then defined as

    $RST = \frac{\Delta E(t)}{\Delta t} = \frac{\Delta(a + bt)}{\Delta t}$
    $\quad = \frac{\Delta a + \Delta(bt)}{\Delta t}$    [Property 3]
    $\quad = \frac{\Delta a + b\Delta t}{\Delta t}$    [Property 2]

Since $\Delta a = a - a = 0$ and $\Delta t = t + 1 - t = 1$, we find $RST = b$. ►

*McGrath, J. J., and O'Hanlon, J. F., Jr., "Methods for Measuring the Rate of Subjective Time," Perceptual and Motor Skills 24 (3, Part 2), 1235-1240 (1967). The authors actually use the derivative of E(t) rather than a ratio of differences to define RST.


Example 2 Suppose that when $x$ units of a certain commodity are produced, the unit production cost is

    $U(x) = a - bx$

where $a$ and $b$ are constants. Then the total production cost is

    $C(x) = xU(x) = ax - bx^2$

In terms of differences, marginal cost $M(x)$ is the change in total production costs resulting from a unit increase in production from $x$ to $x + 1$ units. Here

    $M(x) = \Delta C(x) = \Delta(ax - bx^2)$
    $\quad = a\Delta x - b\Delta x^2$    [Properties 2 and 3]

Since $\Delta x = (x + 1) - x = 1$ and $\Delta x^2 = (x + 1)^2 - x^2 = 2x + 1$, we have

    $M(x) = a - b(2x + 1) = a - b - 2bx$

(Compare this example with Example 1 of Section 12.1.) ►


We complete our list of properties of differences by introducing a mathematical shorthand definition which allows us to distribute a function over a "sum of differences."

Property 4 In an expression of the form

    $a_k\Delta^k f(n) + a_{k-1}\Delta^{k-1} f(n) + \cdots + a_1\Delta f(n) + a_0 f(n)$    (10)

where $a_0, a_1, a_2, \ldots, a_k$ are constants, we treat $f(n)$ as though it could be factored out and we rewrite (10) as

    $[a_k\Delta^k + a_{k-1}\Delta^{k-1} + \cdots + a_1\Delta + a_0] f(n)$    (11) ►

The expression

    $\phi_k(\Delta) = a_k\Delta^k + a_{k-1}\Delta^{k-1} + \cdots + a_1\Delta + a_0$

appearing in (11) is called a polynomial difference operator of degree $k$. The definition, or meaning, of $\phi_k(\Delta)$ is specified through its operation on a function in the manner indicated by the equivalence of (10) and (11). Specifically, $\phi_k(\Delta)$ operates in such a way as to change a function $f(n)$ into another function $\phi_k(\Delta)f(n)$, which is a linear combination of differences of $f(n)$. From the general point of view of Section 4.1, $\phi_k(\Delta)$ is itself a function whose domain is a set of functions. To avoid confusion, the word operator is used in mathematics for such functions.

In Theorem 1 of Section 1.4, we discovered rules for operating with sets which were quite similar to certain algebraic operations with numbers, namely, the commutative, associative, and distributive laws. In the same way, Properties 1-4 above constitute rules for operating with differences which have direct analogies to ordinary algebraic operations with numbers. Specifically, Property 1 shows that repeated operations of taking differences combine as though the symbol $\Delta$ were an algebraic quantity being raised to powers. (Of course, this is not what is happening, but the symbols combine as though it is.) Property 2 states a commutative law for $\Delta$ and its "powers," while Properties 3 and 4 provide distributive laws. The net result is that differences combine with each other and with constants according to precisely the same laws that govern ordinary numbers. Numbers and differences are said to exhibit the same mathematical structure.

Example 3 The repeated operation of taking differences indicated in

$$(\Delta + a)[(\Delta + a)f(n)]$$

may be rewritten without calculations as

$$[\Delta^2 + 2a\Delta + a^2]\,f(n)$$

by simply "multiplying" the indicated "product" $(\Delta + a)(\Delta + a)$. A formal proof that this is the case may be obtained from Properties 1-4 as follows:

$$\begin{aligned}
(\Delta + a)[(\Delta + a)f(n)] &= (\Delta + a)[\Delta f(n) + af(n)] && \text{[Property 4]} \\
&= \Delta[\Delta f(n) + af(n)] + a[\Delta f(n) + af(n)] && \text{[Property 4]} \\
&= \Delta^2 f(n) + \Delta[af(n)] + a\Delta f(n) + a^2 f(n) && \text{[Property 3]} \\
&= \Delta^2 f(n) + 2a\Delta f(n) + a^2 f(n) && \text{[Property 2]} \\
&= (\Delta^2 + 2a\Delta + a^2)f(n) && \text{[Property 4]}
\end{aligned}$$

We shall take advantage of this correspondence between the properties of differences and numbers by writing $(\Delta + a)^2 f(n)$ in place of

$$(\Delta + a)[(\Delta + a)f(n)].$$

Similarly,

$$(\Delta + a)\{(\Delta + b)[(\Delta + a)f(n)]\}$$

may be rewritten as

$$(\Delta + b)(\Delta + a)^2 f(n).$$

Other examples will appear in the exercises and in succeeding sections. ►
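
The identity of Example 3 can also be spot-checked on a concrete function. The sketch below is only an illustration: the helper names `delta` and `delta_plus`, the test function, and the constant a = 3 are assumptions made here.

```python
def delta(f):
    """Forward difference operator: (Δf)(n) = f(n + 1) - f(n)."""
    return lambda n: f(n + 1) - f(n)


def delta_plus(a, f):
    """Apply the operator (Δ + a) to f."""
    df = delta(f)
    return lambda n: df(n) + a * f(n)


# An arbitrary test function and constant (illustrative only).
f = lambda n: n**3 - 2 * n
a = 3

# Left side: (Δ + a) applied twice.
lhs = delta_plus(a, delta_plus(a, f))

# Right side: (Δ^2 + 2aΔ + a^2) f(n), built term by term.
d1 = delta(f)
d2 = delta(d1)
rhs = lambda n: d2(n) + 2 * a * d1(n) + a**2 * f(n)

print([lhs(n) - rhs(n) for n in range(6)])  # every entry should be 0
```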




We conclude this discussion by summarizing the most important results concerning polynomial difference operators. No proofs are necessary beyond citing the correspondence already noted between differences and real numbers and observing that each of these results would be valid if $\Delta$ were a number and the operation were multiplication rather than differencing.

Property 5 Let $\phi_k(\Delta)$ and $\psi_m(\Delta)$ be any two polynomial difference operators and let $c$ be a constant. Then

(i) $\phi_k(\Delta)[cf(n)] = c\,\phi_k(\Delta)f(n)$

(ii) $\phi_k(\Delta)[f(n) + g(n)] = \phi_k(\Delta)f(n) + \phi_k(\Delta)g(n)$

(iii) $[\phi_k(\Delta) + \psi_m(\Delta)]f(n) = \phi_k(\Delta)f(n) + \psi_m(\Delta)f(n)$

(iv) $\phi_k(\Delta)[\psi_m(\Delta)f(n)] = \psi_m(\Delta)[\phi_k(\Delta)f(n)] = [\phi_k(\Delta) \times \psi_m(\Delta)]f(n)$

The symbol $\times$ denotes ordinary multiplication of polynomials. ►

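Example 4 Since

$$(\Delta - 1)2^n = \Delta 2^n - 2^n = (2^{n+1} - 2^n) - 2^n = 0$$
$$(\Delta - 2)3^n = \Delta 3^n - 2\cdot 3^n = (3^{n+1} - 3^n) - 2\cdot 3^n = 0$$

it follows that for any constants $a$ and $b$,

$$(\Delta^2 - 3\Delta + 2)[a\,2^n + b\,3^n] = 0.$$

For,

$$\begin{aligned}
(\Delta^2 - 3\Delta + 2)[a\,2^n + b\,3^n] &= (\Delta - 1)(\Delta - 2)[a\,2^n + b\,3^n] \\
&= (\Delta - 1)(\Delta - 2)[a\,2^n] + (\Delta - 1)(\Delta - 2)[b\,3^n] && \text{[Property 5(ii)]} \\
&= a(\Delta - 2)[(\Delta - 1)2^n] + b(\Delta - 1)[(\Delta - 2)3^n] && \text{[Properties 5(i) and (iv)]} \\
&= a(\Delta - 2)[0] + b(\Delta - 1)[0] \\
&= 0 + 0 = 0
\end{aligned}$$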
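
As a quick numerical companion to Example 4, the sketch below applies the operator Δ² - 3Δ + 2 to a·2ⁿ + b·3ⁿ. The helper names and the constants a = 5, b = -7 are assumptions chosen for illustration.

```python
def delta(f):
    """Forward difference operator: (Δf)(n) = f(n + 1) - f(n)."""
    return lambda n: f(n + 1) - f(n)


def example4_operator(f):
    """Apply the polynomial difference operator Δ^2 - 3Δ + 2 to f."""
    d1 = delta(f)
    d2 = delta(d1)
    return lambda n: d2(n) - 3 * d1(n) + 2 * f(n)


# Arbitrary constants (illustrative only).
a, b = 5, -7
f = lambda n: a * 2**n + b * 3**n

values = [example4_operator(f)(n) for n in range(8)]
print(values)  # every entry should be 0
```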

PROBLEMS 

1. In Example 5, Section 13.1, modify Harrod's approach by assuming that
'« “ s[T(0 - F(r - 1)1 + * y(,) + L 
where t and L are constants Find A/(r), A ’/(,), omJ , ei „ cral> A ,, M 
2. Suppose that population size at time $t$ is given by

$$N(t) = N_0(1 + \alpha - \beta)^t$$

where $N_0$ is the initial population size and $\alpha$ and $\beta$ are constant birth and death rates, respectively. Find $\Delta N(t)$, $\Delta^2 N(t)$, and, in general, $\Delta^n N(t)$.




3. Verify for any function $Y(n)$ that

(a) $Y(n + 1) = Y(n) + \Delta Y(n)$

(b) $Y(n + 2) = Y(n) + 2\Delta Y(n) + \Delta^2 Y(n)$

(c) $Y(n + 3) = Y(n) + 3\Delta Y(n) + 3\Delta^2 Y(n) + \Delta^3 Y(n)$

4. Prove by induction that for the general function $Y(n + r)$ with $r$ any positive integer

$$Y(n + r) = (1 + \Delta)^r\,Y(n).$$

Hence, prove that

$$Y(n + r) = \sum_{i=0}^{r} \binom{r}{i} \Delta^i Y(n).$$

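5. Verify the following formulas for differencing products and quotients. (Compare with Theorems 3 and 5 in Section 12.2.)

(a) $\Delta[f(n)\,g(n)] = g(n)\,\Delta f(n) + f(n + 1)\,\Delta g(n)$

(b) $\Delta\left[\dfrac{f(n)}{g(n)}\right] = \dfrac{g(n)\,\Delta f(n) - f(n)\,\Delta g(n)}{g(n)\,g(n + 1)}$, provided $g(n)$ and $g(n + 1)$ are not zero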

6. If $(n)_r = n(n - 1)\cdots(n - r + 1)$, prove that

(a) $n^2 = (n)_2 + (n)_1$

(b) $n^3 = (n)_3 + 3(n)_2 + (n)_1$

(c) $n^4 = (n)_4 + 6(n)_3 + 7(n)_2 + (n)_1$

7. Extending Problem 6, show by induction that for any integer $r$, $n^r$ can be written as a linear combination of the form

$$n^r = a_r(n)_r + a_{r-1}(n)_{r-1} + \cdots + a_1(n)_1 \tag{12}$$

where $a_1, \ldots, a_r$ are constants.

[Hints: $n^{r+1} = n \cdot n^r$, $n = (n - k) + k$, and $(n - k)(n)_k = (n)_{k+1}$.]

8. If $Y(n) = n^4$, find $\Delta Y(n)$

(a) from the definition of $\Delta$

(b) by first expressing $n^4$ as in Problem 6(c) and then using the result of Problem 14 in Section 13.1

(c) by writing $n^4 = n^3 \cdot n$ and using Problem 5(a)

9. Find the marginal cost function if, when $n$ units are produced, the unit cost is $a - b^{-cn}$, where $a$, $b$, and $c$ are constants. (See Example 2 of this section.)

10. Elasticity of demand is defined as the ratio of the percent change in quantity demanded to the percent change in price. Let $P(t)$ denote the unit price at time $t$, and let $Q(t)$ be the number of units demanded at time $t$. If $P(t) = a - bQ(t)$, show that the elasticity of demand is

$$\frac{100 \times \Delta Q(t)/Q(t)}{100 \times \Delta P(t)/P(t)} = -\frac{1}{b}\,\frac{P(t)}{Q(t)}.$$

11. In Problem 10, find the first difference of elasticity of demand.

12. If $P(t)\,Q(t) = k$, where $k$ is a constant, show that the elasticity of demand is $-Q(t + 1)/Q(t)$.

13. Prove the results concerning polynomial difference operators stated in Property 5.


13.3 APPLICATIONS

Let us recall from Section 11.2 that a polynomial is a function $f$ of the form

$$f(n) = a_0 + a_1 n + a_2 n^2 + \cdots + a_{k-1}n^{k-1} + a_k n^k \tag{13}$$

where $a_0, a_1, a_2, \ldots, a_k$ are constants and $k$ is a non-negative integer. If $a_k \neq 0$, $f$ is a polynomial of degree $k$. Some polynomials are given special names. Thus, the function $f$ whose values are given by $f(n) = 3n^4 + 7n - 2$ is a polynomial of degree four, or quartic, while $g(n) = 6 - n^3$ is a third degree, or cubic, polynomial. Polynomials of degrees zero, one, and two are called constant, linear, and quadratic functions, respectively.

The first difference of a constant function is zero, since if $f(n) = c$ for each value of $n$, then $\Delta f(n) = f(n + 1) - f(n) = c - c = 0$. If $f(n) = cn + b$, a linear function, then $\Delta f(n) = [c(n + 1) + b] - [cn + b] = c$, a constant function. Similarly, the first difference of the quadratic function $f(n) = an^2 + bn + c$ is

$$\Delta f(n) = f(n + 1) - f(n) = a(n + 1)^2 + b(n + 1) + c - [an^2 + bn + c] = 2an + (a + b)$$

which is a linear function. These results are generalized in the following theorem (which should be compared with Theorem 1, Section 12.2).

Theorem 1 The first difference of a polynomial of degree $k \geq 1$ is a polynomial of degree $k - 1$.

PROOF It suffices to prove the theorem for the special case $f(n) = n^k$, that is, to prove that differencing reduces by one the degree of each term in a polynomial. In this case we have

$$\Delta f(n) = f(n + 1) - f(n) = (n + 1)^k - n^k.$$

Using the binomial expansion (Section 7.3),

$$\begin{aligned}
\Delta f(n) &= \left\{ n^k + kn^{k-1} + \binom{k}{2}n^{k-2} + \cdots + kn + 1 \right\} - n^k \\
&= kn^{k-1} + \binom{k}{2}n^{k-2} + \cdots + kn + 1
\end{aligned}$$

which establishes the result. ►

Since each operation of taking differences drops the degree by one, it follows that the $r$th difference of a polynomial of degree $k \geq r$ is a polynomial of degree $k - r$. If $k < r$, the $r$th difference is zero.





Example 1 If the output of a production system for time period $t$ is given by the output function

$$P(t) = 5t^3 - 2t^2 + 10t + 2$$

then $\Delta P(t) = 15t^2 + 11t + 13$, $\Delta^2 P(t) = 30t + 26$, and $\Delta^3 P(t) = 30$. Obviously, $\Delta^k P(t) = 0$ for all $k > 3$. ►
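
The successive differences in Example 1 can be reproduced by applying the difference operator repeatedly. This is a small sketch; the helper names `delta` and `nth_difference` are assumptions introduced here.

```python
def delta(f):
    """Forward difference operator: (Δf)(t) = f(t + 1) - f(t)."""
    return lambda t: f(t + 1) - f(t)


def nth_difference(f, r):
    """Apply Δ to f a total of r times."""
    for _ in range(r):
        f = delta(f)
    return f


# Output function from Example 1.
P = lambda t: 5 * t**3 - 2 * t**2 + 10 * t + 2

for r in range(1, 5):
    print(r, [nth_difference(P, r)(t) for t in range(4)])
```

Each differencing lowers the degree by one, so the third row is constant and the fourth row is all zeros, in line with Theorem 1.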

In most applications of differences it is very important to be able to reverse the operation of taking differences in order to be able to find functions whose differences are given. If, for example, we are asked to find a function $f$ for which $\Delta f(n) = 0$, we know that $f(n) = c$ is such a function. Again, if $\Delta f(n) = 4$, then any linear function of the form $f(n) = 4n + c$ is an answer to the problem. The next theorem generalizes these results to polynomials of any degree; the proof is omitted (but see Problem 8).

Theorem 2 Every polynomial is the first difference of some other polynomial. Specifically, if $g(n)$ is a polynomial of degree $r$, then there is a polynomial $f(n)$ of degree $r + 1$ such that $\Delta f(n) = g(n)$. ►

One application of this theorem is the following method of evaluating sums. Let $g$ be any function and consider the sum $S = \sum_{k=m}^{n} \Delta g(k)$ where $m < n$. Since $\Delta g(k) = g(k + 1) - g(k)$, $S$ may be rewritten as

$$S = [g(m + 1) - g(m)] + [g(m + 2) - g(m + 1)] + \cdots + [g(n) - g(n - 1)] + [g(n + 1) - g(n)].$$

All terms cancel except two, leaving

$$S = g(n + 1) - g(m).$$

Once the function $g$ is known the sum $S$ may easily be computed.

We use this result as follows. Suppose we wish to evaluate the sum $\sum_{k=m}^{n} f(k)$, where $f$ is a given function. If we can find a function $g$ whose first difference $\Delta g$ is the same as $f$, our task is greatly simplified. For then,

$$\sum_{k=m}^{n} f(k) = \sum_{k=m}^{n} \Delta g(k) = g(n + 1) - g(m) \tag{14}$$

as above.

Example 2 If we wish to find the sum $S = \sum_{k=1}^{n} k$ of the first $n$ positive integers, our problem is solved if we can find a function $g$ such that

$$\Delta g(k) = k. \tag{15}$$

Such a function exists according to Theorem 2 and, in fact, is a polynomial of the form

$$g(k) = ak^2 + bk + c. \tag{16}$$

Taking differences in (16) and imposing condition (15) leads to

$$\Delta g(k) = 2ak + a + b = k.$$

In order for this to hold for all $k$, we choose $a = \tfrac{1}{2}$ and $b = -\tfrac{1}{2}$ so that

$$g(k) = \tfrac{1}{2}k^2 - \tfrac{1}{2}k.$$

(The value of $c$ is irrelevant, so choose $c = 0$.) Inserting this into Equation (14) gives

$$S = \sum_{k=1}^{n} k = g(n + 1) - g(1) = \tfrac{1}{2}(n + 1)^2 - \tfrac{1}{2}(n + 1) = \tfrac{1}{2}n^2 + \tfrac{1}{2}n = \tfrac{1}{2}n(n + 1).$$

For instance, $1 + 2 = 3 = \tfrac{1}{2}(2)(3)$, $1 + 2 + 3 = 6 = \tfrac{1}{2}(3)(4)$, $1 + 2 + 3 + \cdots + 100 = \tfrac{1}{2}(100)(101) = 5050$, and so forth. (Compare with Problem 6 in Section 5.2.) ►
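
The telescoping method of Equation (14) translates directly into code. The sketch below uses the function g(k) = k²/2 - k/2 found in Example 2; the function names are assumptions made for this illustration.

```python
def g(k):
    """A function whose first difference is k: g(k) = k*(k - 1)/2."""
    return k * (k - 1) // 2  # k*(k - 1) is always even, so this is exact


def sum_by_telescoping(m, n):
    """Evaluate the sum of k for k = m, ..., n as g(n + 1) - g(m)."""
    return g(n + 1) - g(m)


print(sum_by_telescoping(1, 100))   # telescoped value
print(sum(range(1, 101)))           # direct summation for comparison
```

The same pattern works for any summand f once a function g with Δg = f is known; only `g` needs to change.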


Example 3 Similar techniques may be applied to find sums of values of functions other than polynomials. For instance, to find the geometric sum $h(x, n) = \sum_{k=0}^{n} x^k$, we note that if $g(k) = x^k/(x - 1)$, then

$$\Delta g(k) = \frac{x^{k+1} - x^k}{x - 1} = \frac{x^k(x - 1)}{x - 1} = x^k.$$

Thus

$$h(x, n) = g(n + 1) - g(0) = \frac{x^{n+1}}{x - 1} - \frac{x^0}{x - 1} = \frac{x^{n+1} - 1}{x - 1},$$

a result familiar from our discussion in Section 6.2 concerning partial sums of the geometric series. ►
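
The geometric sum of Example 3 can be evaluated the same way. This is a minimal sketch using exact rational arithmetic; the function name is an assumption, and the method requires x ≠ 1 because g(k) divides by x - 1.

```python
from fractions import Fraction


def geometric_sum(x, n):
    """Sum of x**k for k = 0, ..., n, computed as g(n + 1) - g(0).

    Uses g(k) = x**k / (x - 1), whose first difference is x**k.
    Assumes x != 1.
    """
    x = Fraction(x)
    g = lambda k: x**k / (x - 1)
    return g(n + 1) - g(0)


print(geometric_sum(3, 4))                # 1 + 3 + 9 + 27 + 81
print(geometric_sum(Fraction(1, 2), 3))   # 1 + 1/2 + 1/4 + 1/8
```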


We can also apply a difference method to the problem of determining the maximum and minimum values of a function in a given interval, which we have previously treated by the methods of calculus (Section 12.8). When, for a function $f$, $\Delta f(n) = f(n + 1) - f(n)$ is positive, it means that $f(n + 1)$ exceeds $f(n)$, or that the value of $f$ has increased as we move from $n$ to $n + 1$. Similarly, if $\Delta f(n) < 0$, then $f(n + 1) < f(n)$, indicating a decrease in the value of $f$.

Thus, if we wished to find the point at which $f$ attains its maximum, or largest value (assuming such a point exists), we would look for a value $n$ for which

$$\Delta f(n - 1) > 0 \quad \text{and} \quad \Delta f(n) < 0. \tag{17}$$

For if (17) holds, we have both

$$f(n) > f(n - 1) \quad \text{and} \quad f(n) > f(n + 1)$$

so that, at least relative to points on either side, $f$ attains its largest value at $n$. That is, $n$ is a point of relative maximum.


Example 4 In a monopolistic market, the monopolist may adjust his output and price to obtain maximum profits. Let us suppose that the per unit price $p$ and the output $n$ are related by $p = 10 - (1/1000)n$, so that the total revenue is

$$R(n) = n\,p = 10n - \tfrac{1}{1000}n^2.$$

Then if the total cost function is

$$C(n) = 8 + 4n + \tfrac{1}{500}n^2$$

the profit obtained from the production of $n$ items is

$$P(n) = R(n) - C(n) = 6n - \tfrac{3}{1000}n^2 - 8.$$

In order to find the output which results in maximum profit, we look for that value $m$ of $n$ for which $\Delta P(m - 1) > 0$ and $\Delta P(m) < 0$. Now

$$\Delta P(n) = 6 - \tfrac{3}{1000}(2n + 1)$$

so $\Delta P(n) > 0$ when

$$6 > \tfrac{3}{1000}(2n + 1),$$

that is, when $n < 999.5$, and $\Delta P(n) < 0$ when

$$6 < \tfrac{3}{1000}(2n + 1),$$

that is, when $n > 999.5$. These two cases yield $m = 1000$ as the number of items which should be produced. The maximum profit is

$$P(1000) = 6(1000) - \tfrac{3}{1000}(1000)^2 - 8 = 2992.$$ ►
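
Condition (17) can be applied mechanically by scanning for a sign change in the first difference. The sketch below assumes the profit function P(n) = 6n - (3/1000)n² - 8 of Example 4; the function names and the search range are assumptions made for illustration.

```python
from fractions import Fraction


def profit(n):
    """Profit P(n) = 6n - (3/1000) n^2 - 8, in exact arithmetic."""
    return 6 * n - Fraction(3, 1000) * n**2 - 8


def delta_profit(n):
    """First difference of profit: P(n + 1) - P(n)."""
    return profit(n + 1) - profit(n)


def relative_maximum(lo, hi):
    """Return the first n in [lo, hi) with ΔP(n - 1) > 0 and ΔP(n) < 0."""
    for n in range(lo, hi):
        if delta_profit(n - 1) > 0 and delta_profit(n) < 0:
            return n
    return None


m = relative_maximum(1, 5000)
print(m, profit(m))
```

A linear scan is used only because it mirrors condition (17) directly; for a quadratic profit function the sign change could equally be located by solving ΔP(n) = 0.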


PROBLEMS 

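1. Extend the result of Example 2 by showing that

(a) $\displaystyle\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6}$

(b) $\displaystyle\sum_{k=1}^{n} k^3 = \frac{n^2(n + 1)^2}{4}$

(c) $\displaystyle\sum_{k=1}^{n} (2k - 1) = n^2$

2. Use the results of Problem 1 to find

$$\sum_{k=1}^{10} (k^3 - 6k^2 + 7)$$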





3. The total cost to the Nu-U Cosmetic Company for the production of n 
units is 

T(n) = 10,000 + 400« - In 2 + i« 3 

(a) Find the marginal cost (see Example 2, Section 13.2) for the tenth unit.

(b) When is the marginal cost a minimum and what is the minimal marginal cost?

(c) What is the minimum total cost? Is total cost minimized at the same point as marginal cost?

4. If Y(n) m n°, find AY(n) 

(a) from the definition AY(/i) = Y(n + 1) — Y(ri) 

(b) by writing n* = n i n 2 and using the result of Problem 5(a), 
Section 13 2 

5. Find a function $Y(t)$ for which

(a) $\Delta Y(t) = Y(t + 1) - Y(t) = t^2 + 3t + 2$

(b) $\Delta Y(t) = 2t + 3$

(c) $\Delta Y(t) = 2 \cdot 2^t + 1$

6. Prove that if $Y(n)$ is the $r$th-degree polynomial

$$Y(n) = a_r n^r + a_{r-1}n^{r-1} + \cdots + a_1 n + a_0$$

then $\Delta^r Y(n) = a_r\,r!$ and all succeeding differences are zero.

7. If when $n$ batches of an item are produced, the per unit cost is

$$C(n) = n^2 - 6n + 10$$

show that the unit costs are minimized when three batches are produced.

8. Combine the results of Problem 14, Section 13.1, and Problem 7, Section 13.2, to prove Theorem 2.

13.4 THE SIMPLEST DIFFERENCE EQUATIONS

Equations involving differences of functions arise in certain mathematical models employed in the behavioral sciences. Such equations naturally are called difference equations. To solve a difference equation is to find a function which, when substituted into the equation, yields an identity, that is, an expression valid for all values of the variable. As a simple example, the equation $\Delta f(n) = 2n + 1$ has the solution $f(n) = n^2 + 3$, since

$$\Delta(n^2 + 3) = (n + 1)^2 + 3 - [n^2 + 3] = 2n + 1$$

holds regardless of the value of $n$.

We shall limit our general remarks to linear difference equations, these being the simplest and the type most often encountered in practice. A linear difference equation has the form

$$\phi(\Delta)f(n) = g(n) \tag{18}$$

where $\phi(\Delta)$ is a polynomial difference operator of the type introduced in Equations (10) and (11) in Section 13.2, $g$ is a known function, and $f$ is the function for which a solution is sought. Equation (18) is called linear because the function $f$ and its differences appear to the first power. As examples, the equation $\Delta^2 f(n) - 3\Delta f(n) = 2n^3$ is linear, while $\Delta f(n) + 2[f(n)]^2 = 0$ is not linear.

The order of the linear difference equation (18) is defined to be the degree of the polynomial difference operator $\phi(\Delta)$. For instance, $\Delta f(n) = 3n^2$ is a first order equation, while $\Delta^4 f(n) + 6\Delta^2 f(n) - f(n) = 2$ is of fourth order.

Solutions to the simplest linear difference equations may be easily obtained from our prior knowledge of differences. We know, for example, that the first difference of a constant function is zero (Section 13.3). Hence the equation $\Delta f(n) = 0$ has the solution $f(n) = c$, where $c$ may be any constant. For this equation there are no other solutions since

$$\Delta f(n) = f(n + 1) - f(n) = 0$$

means that $f(n + 1) = f(n)$ for all values of $n$. It follows that all functional values are the same and we have proved the following theorem.

Theorem 3 A function $f$ is a solution of $\Delta f(n) = 0$ if and only if $f$ is a constant function. ►

An immediate consequence of Theorem 3, and a key to solving difference equations, is the next theorem.

Theorem 4 Two functions which have the same difference function can differ at most by a constant amount. That is, if $\Delta f(n) = \Delta g(n)$, then there is a constant $c$ (which may be zero) such that for all values of $n$, $f(n) = g(n) + c$.

PROOF Assume $\Delta f(n) = \Delta g(n)$ for all values of $n$ and define a new function $h$ by $h(n) = f(n) - g(n)$. Then $\Delta h(n) = \Delta f(n) - \Delta g(n) = 0$ and it follows from Theorem 3 that $h$ must be a constant function. This completes the proof. ►

A major concern in solving any difference equation is to make sure, as we were able to do in Theorem 3, that all solutions have been found. For instance, any constant function $f(n) = c$ is a solution of the equation

$$\Delta^2 f(n) = 0. \tag{19}$$

But there are other solutions, such as $f(n) = 6n$ and $f(n) = 3 - 2n$, as may easily be verified. To find all possible solutions of the difference equation (19), we first substitute $g(n) = \Delta f(n)$, obtaining the equivalent form $\Delta g(n) = 0$. According to Theorem 3, the only solutions of this latter equation are constants, so that the task of solving (19) may be replaced by that of solving

$$g(n) = \Delta f(n) = c. \tag{20}$$

One solution of (20) is $h(n) = cn$, since $\Delta(cn) = c(n + 1) - cn = c$. If $f(n)$ is any other solution, that is, if $\Delta f(n) = c$ also, then it follows from Theorem 4 that there is a constant $b$ such that $f(n) = cn + b$ for all values of $n$. In short, we have proved the following theorem.

Theorem 5 Every solution of the equation $\Delta^2 f(n) = 0$ has the form $f(n) = cn + b$, where $b$ and $c$ are constants. ►


Example 1 Suppose* that reaction time $y$ is related to the number $n$ of units of information in a choice situation having $2^n$ alternatives by the equation

$$y(n + 2) = 2y(n + 1) - y(n).$$

Then

$$y(n + 2) - 2y(n + 1) + y(n) = \Delta^2 y(n) = 0$$

and it follows from Theorem 5 that $y(n) = a + bn$. Hyman's results show that $y$ is approximately equal to 0.2 when $n = 0$ and is about 0.8 when $n = 4$. Hence

$$y(0) = 0.2 = a + b(0)$$

$$y(4) = 0.8 = a + b(4)$$

from which we obtain $a = 0.2$, $b = 0.15$, and $y(n) = 0.2 + 0.15n$. ►
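
The two constants in Example 1 come from two linear conditions, which the following sketch solves and then checks against the original recurrence. The function name `fit_line` is an assumption, and exact fractions are used in place of decimals to avoid rounding.

```python
from fractions import Fraction


def fit_line(n0, y0, n1, y1):
    """Solve y(n) = a + b*n from two known values y(n0) = y0, y(n1) = y1."""
    b = (y1 - y0) / (n1 - n0)
    a = y0 - b * n0
    return a, b


# Initial conditions from the example: y(0) = 0.2 and y(4) = 0.8.
a, b = fit_line(0, Fraction(2, 10), 4, Fraction(8, 10))
y = lambda n: a + b * n

print(float(a), float(b))
# The fitted line should satisfy y(n + 2) = 2*y(n + 1) - y(n) for every n.
print(all(y(n + 2) == 2 * y(n + 1) - y(n) for n in range(10)))
```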


In order to evaluate the constants $a$ and $b$ in Example 1, it was necessary to know the values of $y$ corresponding to two different values of $n$. Functional values, such as $y(0) = 0.2$ and $y(4) = 0.8$ in Example 1, which are specified in order to obtain a unique solution to a difference equation, are said to constitute the initial conditions on the problem.

In the proof of Theorem 5 the solution of the second-order equation (19) was facilitated by replacing it by the first-order equation (20). A similar reduction of order enables us to solve higher-order equations and leads to the following theorem.

Theorem 6 Let $k$ be any positive integer. A function $f$ is a solution of the equation $\Delta^k f(n) = 0$ if and only if $f$ is a polynomial of degree $k - 1$ or less, that is, if and only if $f$ has the form

$$f(n) = c_0 + c_1 n + c_2 n^2 + \cdots + c_{k-1}n^{k-1}$$

where $c_0, c_1, \ldots, c_{k-1}$ are constants.

*Adapted from Hyman, R., "Stimulus Information as a Determinant of Reaction Time," Journal of Experimental Psychology 45, 188-196 (1953).





PROOF Theorems 3 and 5 have established this result for $k = 1$ and $k = 2$. Suppose, as an inductive assumption, that the theorem is true for $k = r$. That is, suppose that all solutions of $\Delta^r f(n) = 0$ have the form $f(n) = b_0 + b_1 n + \cdots + b_{r-1}n^{r-1}$, where $b_0, b_1, \ldots, b_{r-1}$ are constants.

To solve the $(r + 1)$st order equation $\Delta^{r+1} f(n) = 0$, we first substitute $g(n) = \Delta f(n)$, reducing the order and obtaining $\Delta^r g(n) = 0$. The inductive assumption then states that $g(n)$ must have the form

$$g(n) = \Delta f(n) = b_0 + b_1 n + \cdots + b_{r-1}n^{r-1}. \tag{21}$$

Adapted to our present needs, Theorem 2 guarantees that there is a polynomial $h(n) = c_0 + c_1 n + c_2 n^2 + \cdots + c_r n^r$ with the property that $\Delta h(n) = b_0 + b_1 n + \cdots + b_{r-1}n^{r-1}$. That is, $h$ is a solution of (21). But any other solution $f(n)$ is such that $\Delta f(n) = \Delta h(n)$. It follows from Theorem 4 that there is a constant $c$ such that $f(n) = h(n) + c$, whence $f$ itself must be a polynomial of degree $r$ or less. This completes the proof. ►


Example 2 Let $t$ denote time, in years, after the year 1779 and suppose that $S(t)$, the number of scientists in the United States at time $t$, is governed by the difference equation $\Delta^3 S(t) = 0$. Suppose further that the numbers of U.S. citizens classified as scientists in the years 1779, 1780, and 1781 were, respectively, $S(0) = 300$, $S(1) = 310$, and $S(2) = 345$.

From Theorem 6, we know that $S$ has the form $S(t) = a + bt + ct^2$, where $a$, $b$, and $c$ are constants to be determined from the specified values of $S$. Substituting $t = 0$, 1, and 2 we obtain

$$S(0) = a + b(0) + c(0^2) = a = 300$$

$$S(1) = a + b(1) + c(1^2) = a + b + c = 310$$

$$S(2) = a + b(2) + c(2^2) = a + 2b + 4c = 345$$

Solving these equations for $a$, $b$, and $c$ yields $a = 300$, $b = -2.5$, and $c = 12.5$, so that

$$S(t) = 12.5t^2 - 2.5t + 300.$$

According to this equation the number of U.S. scientists in the year 1979 is predicted to be

$$S(200) = 12.5(200)^2 - 2.5(200) + 300 = 499{,}800.$$ ►
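
The three constants in Example 2 can be recovered by elimination and the fitted quadratic checked against the difference equation. In this sketch the function name `fit_quadratic` is an assumption, and the three data points are taken at t = 0, 1, 2 as in the example.

```python
from fractions import Fraction


def fit_quadratic(s0, s1, s2):
    """Solve S(t) = a + b*t + c*t**2 from S(0), S(1), S(2).

    From S(0) = a, S(1) = a + b + c and S(2) = a + 2b + 4c,
    eliminate b to get 2c = (S(2) - a) - 2*(S(1) - a).
    """
    a = Fraction(s0)
    c = ((Fraction(s2) - a) - 2 * (Fraction(s1) - a)) / 2
    b = Fraction(s1) - a - c
    return a, b, c


a, b, c = fit_quadratic(300, 310, 345)
S = lambda t: a + b * t + c * t**2

print(a, b, c)
print(S(200))
# Third differences of a quadratic vanish: S(t+3) - 3S(t+2) + 3S(t+1) - S(t) = 0.
print(all(S(t + 3) - 3 * S(t + 2) + 3 * S(t + 1) - S(t) == 0 for t in range(10)))
```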


PROBLEMS 

1. Which of the following are linear difference equations? What is the order of those equations which are linear? (The letter $c$ denotes a constant.)

( i) A; 00 + 2} 09 = 0 

(b) A> 00 + («) = 2 

(c) ) 00 A 4 00 - 1 

(d) A b 00 — A 2 } (rt) -f c - 6/t 

(e) [A) OOP 2 + >(> ») « « J 





2. In each case, verify that the given function is a solution of the difference equation. (The letters $a$, $b$, and $c$ denote constants.)

(a) $y(n + 1) - y(n) = 0$, $y(n) = 1$

(b) $y(n + 1) - y(n) = 1$, $y(n) = n + c$

(c) $f(n + 1) - f(n) = n$, $f(n) = n(n - 1)/2$

(d) $\Delta f(n) = 3n(n + 1)/2$, $f(n) = [n(n^2 - 1)/2] + c$

(e) $h(n + 1) - 2h(n) = 0$, $h(n) = 2^n$

(f) $h(n + 1) - 2h(n) = 0$, $h(n) = c\,2^n$

(g) $h(n + 1) - 2h(n) = 0$, $h(n) = c\,2^{n+1}$

(h) $y(n + 2) - 3y(n + 1) + 2y(n) = 0$, $y(n) = a + b\,2^n$

3. The difference equation $[\Delta y(n)]^2 + [y(n)]^2 = -1$ has no solution. Why?

4. Verify in each case that the given function is a solution to the difference equation together with the indicated initial conditions.

(a) $y(t) = 4y(t - 1) + 3$, $y(0) = 5$
(Solution: $y(t) = 6 \cdot 4^t - 1$)

(b) $y(t + 2) = y(t)$, $y(0) = 0$, $y(1) = 1$
(Solution: $y(t) = \tfrac{1}{2}[1 - (-1)^t]$)

5. (a) Show that the function $f(n) = (c_1 + c_2 n)2^n$ is a solution of the equation $f(n + 2) - 4f(n + 1) + 4f(n) = 0$, regardless of the values assigned to the constants $c_1$ and $c_2$.

(b) Find the particular solution (that is, the particular choice of $c_1$ and $c_2$) which satisfies the initial conditions $f(0) = 2$ and $f(1) = 12$.

6. Find the function $g(t)$ which satisfies the following conditions.

(a) $g(t + 2) - 2g(t + 1) + g(t) = 0$, $g(0) = 1$, $g(2) = 4$

(b) $g(t + 3) - 3g(t + 2) + 3g(t + 1) - g(t) = 0$, $g(0) = 10$, $g(1) = 20$, $g(2) = 40$

7. Prove that the difference equation $f(n + 1) - \lambda f(n) = 0$ has the solution $f(n) = c\lambda^n$, where $c$ is a constant, by

(a) induction

(b) rewriting the equation in the form

$$\frac{f(n + 1)}{\lambda^{n+1}} - \frac{f(n)}{\lambda^n} = 0$$

8. Given the economic model

$$Y(t + 1) - C(t + 1) - I(t + 1) = 0$$

$$C(t + 1) - bY(t) = a \qquad (a > 0 \text{ and } 0 < b < 1)$$

$$I(t + 1) - I(t) = 0$$

with known initial values $Y(0)$, $C(0)$, and $I(0)$.

(a) Show that the difference equations

(i) $Y(t + 1) - bY(t) = a + I(0)$

(ii) $C(t + 1) - bC(t) = a + bI(0)$

follow from the model.

(b) Verify that

$$Y(t) = b^t\left(Y(0) - \frac{a + I(0)}{1 - b}\right) + \frac{a + I(0)}{1 - b}$$

$$C(t) = b^t\left(C(0) - \frac{a + bI(0)}{1 - b}\right) + \frac{a + bI(0)}{1 - b}$$

satisfy the respective difference equations.

(c) Verify that $Y(t)$ and $C(t)$ satisfy the economic model.


13.5 GENERAL LINEAR DIFFERENCE EQUATIONS 
WITH RIGHT-HAND SIDE ZERO 

Thus far we have learned only how to solve linear difference equations which involve iterations of the operator $\Delta$ itself. Theorem 6 summarizes our efforts and at the same time provides the basis for solving more complicated equations. Recall that $(\Delta - a)f(n)$ is short for $\Delta f(n) - af(n)$ and that $(\Delta - a)^k f(n)$ indicates that the operator $\Delta - a$ is to be applied $k$ successive times. The operators $\Delta - a$ and $\Delta$ are related as follows.

Theorem 7 For all positive integers $n$ and $k$,

$$\Delta^k[(1 + a)^{-n}f(n)] = (1 + a)^{-(n+k)}[(\Delta - a)^k f(n)] \tag{22}$$

PROOF Since the expression (22) is somewhat complicated, let us look first at the situation when $k = 1$. We wish to compute the first difference of $(1 + a)^{-n}f(n)$. This is

$$\begin{aligned}
\Delta[(1 + a)^{-n}f(n)] &= (1 + a)^{-(n+1)}f(n + 1) - (1 + a)^{-n}f(n) \\
&= (1 + a)^{-(n+1)}[f(n + 1) - (1 + a)f(n)] \\
&= (1 + a)^{-(n+1)}[f(n + 1) - f(n) - af(n)] \\
&= (1 + a)^{-(n+1)}[\Delta f(n) - af(n)] \\
&= (1 + a)^{-(n+1)}[(\Delta - a)f(n)]
\end{aligned} \tag{23}$$

For $k = 1$, then, Equation (22) is valid.

The proof may be completed by induction. If (22) is true for $k = r$, then we have the inductive assumption

$$\Delta^r[(1 + a)^{-n}f(n)] = (1 + a)^{-(n+r)}[(\Delta - a)^r f(n)] \tag{24}$$

It follows that

$$\begin{aligned}
\Delta^{r+1}[(1 + a)^{-n}f(n)] &= \Delta\{\Delta^r[(1 + a)^{-n}f(n)]\} \\
&= \Delta\{(1 + a)^{-(n+r)}(\Delta - a)^r f(n)\} && \text{[by (24)]} \\
&= (1 + a)^{-(n+r+1)}[(\Delta - a)^{r+1}f(n)] && \text{[from (23)]}
\end{aligned}$$

and this completes the proof. ►
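
Equation (22) is easy to spot-check numerically for particular choices of f, a, k, and n. The sketch below uses exact fractions; the helper names are assumptions, and a is taken different from -1 so that the powers of (1 + a) exist.

```python
from fractions import Fraction


def iterate(op, f, k):
    """Apply the operator `op` to the function `f` a total of k times."""
    for _ in range(k):
        f = op(f)
    return f


def delta(f):
    """Forward difference operator: (Δf)(n) = f(n + 1) - f(n)."""
    return lambda n: f(n + 1) - f(n)


def theorem7_sides(f, a, k, n):
    """Return both sides of Equation (22) for the given f, a, k, n (a != -1)."""
    a = Fraction(a)

    def delta_minus_a(h):
        # The operator (Δ - a): h(m + 1) - h(m) - a*h(m).
        return lambda m: h(m + 1) - h(m) - a * h(m)

    weighted = lambda m: (1 + a) ** (-m) * f(m)
    left = iterate(delta, weighted, k)(n)
    right = (1 + a) ** (-(n + k)) * iterate(delta_minus_a, f, k)(n)
    return left, right


f = lambda n: n**2 + 1  # an arbitrary test function
print(theorem7_sides(f, Fraction(1, 2), 2, 3))  # the two entries should agree
```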





The principal worth of Theorem 7 is that it enables us to find solutions to all equations of the form $(\Delta - a)^k f(n) = 0$.

Theorem 8 Let $k$ be any positive integer and $a \neq -1$. A function $f$ is a solution of the equation $(\Delta - a)^k f(n) = 0$ if and only if $f$ has the form

$$f(n) = [c_0 + c_1 n + \cdots + c_{k-1}n^{k-1}](1 + a)^n$$

where $c_0, c_1, \ldots, c_{k-1}$ are constants. (Theorem 6 is the special case $a = 0$.)

PROOF Since $(1 + a)^{-(n+k)}$ is not zero, the equation $(\Delta - a)^k f(n) = 0$ is equivalent to $(1 + a)^{-(n+k)}(\Delta - a)^k f(n) = 0$. But, by Theorem 7, this latter expression is the same as $\Delta^k[(1 + a)^{-n}f(n)] = 0$. Applying Theorem 6 gives

$$(1 + a)^{-n}f(n) = c_0 + c_1 n + \cdots + c_{k-1}n^{k-1}$$

from which the conclusion of the theorem follows upon multiplying both sides by $(1 + a)^n$. ►

The restriction $a \neq -1$ in Theorem 8 is actually no loss of information. For in this case, $(\Delta - a)f(n) = 0$ becomes

$$(\Delta + 1)f(n) = f(n + 1) - f(n) + f(n) = f(n + 1) = 0$$

and thus the solution is already known. Similar comments apply to $(\Delta + 1)^k f(n)$ when $k > 1$.

Example 1 Harrod's one-country model of economic growth (Example 5, Section 13.1) leads to the equation

$$Y(t + 1) - Y(t) = \frac{c}{g}\,Y(t + 1)$$

where $c$ and $g$ are constants and $Y(t)$ denotes national income during time period $t$. Rewriting this equation in the form

$$Y(t + 1) = \frac{g}{g - c}\,Y(t)$$

and then subtracting $Y(t)$ to form a first difference yields the equation

$$Y(t + 1) - Y(t) = \frac{g}{g - c}\,Y(t) - Y(t)$$

or

$$\left(\Delta - \frac{c}{g - c}\right)Y(t) = 0.$$

According to Theorem 8, any solution of this equation looks like

$$Y(t) = a\left(1 + \frac{c}{g - c}\right)^t = a\left(\frac{g}{g - c}\right)^t$$

where $a$ is a constant. Since $Y(0) = a[g/(g - c)]^0 = a$, the solution is

$$Y(t) = Y(0)\left(\frac{g}{g - c}\right)^t.$$ ►
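
The closed form for Harrod's model can be compared with direct iteration of the rewritten equation Y(t + 1) = [g/(g - c)]Y(t). In the sketch below the function names and the numerical values Y(0) = 100, c = 1, g = 5 are purely illustrative assumptions; the only requirement is g ≠ c.

```python
from fractions import Fraction


def harrod_closed_form(y0, c, g, t):
    """Solution Y(t) = Y(0) * (g / (g - c))**t."""
    return Fraction(y0) * (Fraction(g) / (Fraction(g) - c)) ** t


def harrod_iterate(y0, c, g, steps):
    """Generate Y(0), ..., Y(steps) from Y(t + 1) = g/(g - c) * Y(t)."""
    path = [Fraction(y0)]
    for _ in range(steps):
        path.append(path[-1] * Fraction(g) / (Fraction(g) - c))
    return path


# Illustrative values (assumed for this sketch only).
y0, c, g = 100, 1, 5
path = harrod_iterate(y0, c, g, 6)
print([float(y) for y in path])
print(all(path[t] == harrod_closed_form(y0, c, g, t) for t in range(7)))
```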

Example 2 In developing his multiplier-acceleration principle, Samuelson* assumes that national income $Y(t)$ at time $t$ is the sum of estimated consumer demand $C(t)$ and estimated investment demand $I(t)$. He further assumes that consumer demand is given by

$$C(t) = kY(t - 1) \tag{25}$$

where $k$ is the multiplier constant, and investment demand by

$$I(t) = A[Y(t - 1) - Y(t - 2)] \tag{26}$$

where $A$ is the acceleration constant. Putting (25) and (26) into the original equation $Y(t) = C(t) + I(t)$ yields the backward difference equation

$$Y(t) = (k + A)Y(t - 1) - AY(t - 2).$$

The equivalent (forward) difference equation is

$$Y(t + 2) = (k + A)Y(t + 1) - AY(t). \tag{27}$$

Assume for purposes of illustration that $k = \tfrac{3}{4}$ and $A = 2\tfrac{1}{4}$. Then Equation (27) is

$$Y(t + 2) = 3Y(t + 1) - \tfrac{9}{4}Y(t).$$

Since this equation involves values of $Y$ at three consecutive time points, it is a second-order difference equation. To put it into difference notation, we begin by adding and subtracting terms to form a second difference, as in Equation (4) of Section 13.1. This gives

$$Y(t + 2) - 2Y(t + 1) + Y(t) = Y(t + 1) - \tfrac{5}{4}Y(t)$$

or

$$\Delta^2 Y(t) = Y(t + 1) - \tfrac{5}{4}Y(t).$$

Next we add and subtract $Y(t)$ to incorporate $Y(t + 1)$ into a first difference. The equation now looks like

$$\Delta^2 Y(t) = Y(t + 1) - Y(t) - \tfrac{1}{4}Y(t)$$

or, in operator notation,

$$\Delta^2 Y(t) - \Delta Y(t) + \tfrac{1}{4}Y(t) = 0.$$

Factoring the operator gives

$$(\Delta - \tfrac{1}{2})^2\,Y(t) = 0$$

the solution of which, by Theorem 8, is

$$Y(t) = (c_0 + c_1 t)\left(\tfrac{3}{2}\right)^t.$$

In order to illustrate how specific values may be obtained for the constants $c_0$ and $c_1$, let us suppose that $Y$ values of 2 and 3.5 are observed at times $t = 0$ and $t = 1$, respectively. Then

$$2 = Y(0) = [c_0 + c_1(0)]\left(\tfrac{3}{2}\right)^0 = c_0$$

$$3.5 = Y(1) = [c_0 + c_1(1)]\left(\tfrac{3}{2}\right)^1 = \tfrac{3}{2}(c_0 + c_1)$$

From this we find $c_0 = 2$ and $c_1 = \tfrac{1}{3}$, so that the solution of (27) is

$$Y(t) = \left(2 + \tfrac{1}{3}t\right)\left(\tfrac{3}{2}\right)^t.$$

Obviously, $Y(t) \to \infty$ as $t \to \infty$.

*Cited in Baumol, W. J., Economic Dynamics (Macmillan Company, New York, 1959), 2nd ed.

With the information now at hand it is but a small step to the solution of complicated equations involving mixtures of operators. Let us look at the simplest case

$$(\Delta - a)(\Delta - b)f(n) = 0 \tag{28}$$

where $a \neq b$ and neither $a$ nor $b$ is equal to $-1$, in order to see how the general case goes.

First, substituting $g(n) = (\Delta - b)f(n)$ reduces the order and gives $(\Delta - a)g(n) = 0$, for which we already know the solution

$$g(n) = (\Delta - b)f(n) = c(1 + a)^n$$

where $c$ is some constant. We now use Theorem 7 to replace $(\Delta - b)f(n)$ by

$$(1 + b)^{n+1}\Delta[(1 + b)^{-n}f(n)]$$

obtaining

$$(1 + b)^{n+1}\Delta[(1 + b)^{-n}f(n)] = c(1 + a)^n$$

Dividing through by $(1 + b)^{n+1}$ gives

$$\Delta[(1 + b)^{-n}f(n)] = \frac{c}{1 + b}\left[\frac{1 + a}{1 + b}\right]^n = dx^n$$

where $d = c/(1 + b)$ and $x = (1 + a)/(1 + b)$ are constants.

The results of Example 3 in Section 13.3 show that $dx^n/(x - 1)$ is a function whose first difference is $dx^n$. It follows (Theorem 4) that we must have

$$(1 + b)^{-n}f(n) = \frac{dx^n}{x - 1} + q$$

where $q$ is another constant. That is, letting $p = d/(x - 1)$, we obtain

$$(1 + b)^{-n}f(n) = p\left[\frac{1 + a}{1 + b}\right]^n + q$$

or

$$f(n) = p(1 + a)^n + q(1 + b)^n \tag{29}$$

for some constants $p$ and $q$. To sum up, we have shown that every solution of (28) looks like (29).





On the other hand, no matter what values of $p$ and $q$ may be chosen, (29) is a solution of (28). For, using the basic properties of differences, we find

$$\begin{aligned}
(\Delta - a)(\Delta - b)[p(1 + a)^n + q(1 + b)^n]
&= (\Delta - b)(\Delta - a)[p(1 + a)^n] + (\Delta - a)(\Delta - b)[q(1 + b)^n] \\
&= p(\Delta - b)[(\Delta - a)(1 + a)^n] + q(\Delta - a)[(\Delta - b)(1 + b)^n] \\
&= p(\Delta - b)[0] + q(\Delta - a)[0] \\
&= 0 + 0 = 0
\end{aligned}$$

Now that we have gone through all the formalities, let us see how our job can be simplified. Comparing (28) and (29) we note that each factor of the operator produces a corresponding term in the solution. This is no accident, as the following reasoning shows. Suppose $h$ is a function which makes $(\Delta - b)h(n) = 0$. Then

$$(\Delta - a)[(\Delta - b)h(n)] = (\Delta - a)(0) = 0$$

and it follows that $h$ is a solution of the original equation (28). Similarly, if a function $g$ is such that $(\Delta - a)g(n) = 0$, then

$$(\Delta - a)(\Delta - b)g(n) = (\Delta - b)[(\Delta - a)g(n)] = (\Delta - b)(0) = 0$$

and $g$, too, is a solution of (28).

We know already that $(\Delta - a)g(n) = 0$ has $(1 + a)^n$ as a solution and that $(1 + b)^n$ is a solution of $(\Delta - b)h(n) = 0$. Combining our results, we see that for any constants $p$ and $q$, the function $f(n) = p(1 + a)^n + q(1 + b)^n$ is a solution of (28).

Example 3 Let us suppose in the multiplier-acceleration model of Example 2 that $A = 2\tfrac{2}{3}$ and $k = \tfrac{2}{3}$ and that the initial conditions are $Y(0) = 300$ and $Y(1) = 450$. Then the general equation (27) becomes

$$Y(t + 2) = \tfrac{10}{3}Y(t + 1) - \tfrac{8}{3}Y(t)$$

which may be rearranged to read

$$Y(t + 2) - 2Y(t + 1) + Y(t) - \tfrac{4}{3}[Y(t + 1) - Y(t)] + \tfrac{1}{3}Y(t) = 0$$

or, in operator notation,

$$(\Delta^2 - \tfrac{4}{3}\Delta + \tfrac{1}{3})Y(t) = 0$$

Multiplying by 3 and factoring yields the equation

$$(3\Delta - 1)(\Delta - 1)Y(t) = 0$$

for which the solution is

$$Y(t) = c_0(1 + \tfrac{1}{3})^t + c_1(1 + 1)^t = c_0(\tfrac{4}{3})^t + c_1(2)^t$$

The initial conditions

$$Y(0) = 300 = c_0 + c_1 \quad \text{and} \quad Y(1) = 450 = \tfrac{4}{3}c_0 + 2c_1$$

give $c_0 = 225$ and $c_1 = 75$, so that

$$Y(t) = 225(\tfrac{4}{3})^t + 75(2)^t$$ ►
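The step from the initial conditions to the constants in Example 3 can be sketched in a few lines of Python. The helper `fit_constants` is a hypothetical name introduced here; it solves the two linear equations for $c_0$ and $c_1$ when the two bases are distinct, and the last lines check that the resulting closed form satisfies the recurrence.

```python
from fractions import Fraction

def fit_constants(r0, r1, y0, y1):
    """Solve c0 + c1 = y0 and c0*r0 + c1*r1 = y1 for distinct bases r0, r1."""
    c1 = (y1 - r0 * y0) / (r1 - r0)
    c0 = y0 - c1
    return c0, c1

r0, r1 = Fraction(4, 3), Fraction(2)
c0, c1 = fit_constants(r0, r1, Fraction(300), Fraction(450))

def y_closed(t):
    """Closed form c0*(4/3)**t + c1*2**t with the fitted constants."""
    return c0 * r0 ** t + c1 * r1 ** t

# Check Y(t+2) = (10/3) Y(t+1) - (8/3) Y(t) for several values of t.
recurrence_ok = all(
    y_closed(t + 2) == Fraction(10, 3) * y_closed(t + 1) - Fraction(8, 3) * y_closed(t)
    for t in range(8)
)
print(c0, c1, recurrence_ok)
```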

It should be apparent that the approach outlined above generalizes to any problem in which the operator can be factored. (A general proof using induction is possible, but lengthy, and will be omitted.) To cite just two further examples, the equation

$$(\Delta - 2)(\Delta + 3)^3 f(n) = 0$$

has the solution

$$f(n) = a(1 + 2)^n + b(1 - 3)^n + cn(1 - 3)^n + dn^2(1 - 3)^n$$

while all solutions of

$$(\Delta + 2)^2(\Delta - 7)(\Delta - 1)^4 g(n) = 0$$

look like

$$(a + bn)(1 - 2)^n + c(1 + 7)^n + (d + en + gn^2 + hn^3)(1 + 1)^n$$

(The letters $a$, $b$, $c$, $d$, $e$, $g$, and $h$ denote constants.)
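The first of these two examples can be checked mechanically. The sketch below, in Python, applies the factors of $(\Delta - 2)(\Delta + 3)^3$ one at a time to a sequence built from the stated solution form. The constants 5, -1, 2, 7 are arbitrary illustrative choices, and `delta_minus` is a hypothetical helper that uses the identity $(\Delta - a)f(n) = f(n + 1) - (1 + a)f(n)$.

```python
def delta_minus(a, seq):
    """Apply (Delta - a) to a finite sequence: result[n] = seq[n+1] - (1 + a)*seq[n]."""
    return [seq[n + 1] - (1 + a) * seq[n] for n in range(len(seq) - 1)]

def f(n, a=5, b=-1, c=2, d=7):
    """Solution form a*3**n + (b + c*n + d*n**2)*(-2)**n with arbitrary constants."""
    return a * 3 ** n + (b + c * n + d * n ** 2) * (-2) ** n

seq = [f(n) for n in range(12)]
seq = delta_minus(2, seq)      # factor (Delta - 2)
for _ in range(3):             # factor (Delta + 3)**3
    seq = delta_minus(-3, seq)
print(all(v == 0 for v in seq))
```

Each application shortens the finite sequence by one term, so twelve starting values leave eight values to inspect after the four factors have been applied.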

PROBLEMS 

1. Let $S_0, S_1, S_2, \ldots$ be a sequence of objective stimulus values in which $\Delta S_t = S_{t+1} - S_t$ represents a just noticeable change in stimulus intensity. If $R_t$ is a subjective psychological response to $S_t$, the Weber-Fechner law* states that $\Delta R_t = c(\Delta S_t/S_t)$, where $c$ is a constant. Show that if $\Delta R_t = 1$ for all $t$, then $S_t = (1 + a)^t S_0$, where $a = 1/c$.

In Problems 2-6, find the solution of the indicated difference equation which satisfies the given initial conditions.

2. $Y(t + 2) = 4Y(t + 1) - 4Y(t)$, $Y(0) = 10$, $Y(1) = 12$

3. $f(t + 2) = -2f(t + 1) - f(t)$, $f(0) = 2$, $f(1) = 8$

4. $g(n) = 4g(n - 1) - 3g(n - 2)$, $g(0) = 2$, $g(1) = 4$

5. $h(n + 1) = 5h(n) - 6h(n - 1)$, $h(0) = -9$, $h(1) = -12$

6. $Y(t + 2) = 2Y(t)$, $Y(0) = 3$, $Y(1) = 8$

* Guilford, J. P., Psychometric Methods (McGraw-Hill Book Company, New York, 1954), 2nd ed., pp. 37-42.





7. Show that if Harrod's model is modified so that savings $S(t)$ is assumed to be proportional to income $Y(t + 1)$ in the next time period, then income $Y(t)$ satisfies the difference equation

$$Y(t + 2) = (g/c)[Y(t + 1) - Y(t)]$$

Solve this equation for the case when $Y(0) = 1$, $Y(1) = 5$, and $g = 4c$.

8. In Example 1, argue that the income sequence $Y(t)$ is
(i) monotone increasing and unbounded when $0 < c < g$
(ii) oscillatory and unbounded when $g < c < 2g$
(iii) oscillatory but convergent to zero when $c > 2g$
(iv) constant when $c = 0$
(v) oscillatory and bounded when $c = 2g$

In Problems 9-13, solve the indicated equation and discuss the limiting behavior of the solution.

9. $Y(t + 1) = 2Y(t)$, $Y(0) = 3$

10. $f(t + 1) + 3f(t) = 0$, $f(0) = 1$

11. gin + 1) - IgOO = 0, g(0) « 4 

12. f{n + 1) + \fin) = 0, /(0) = 5 

13. $Y(n + 1) + 4Y(n) = 0$, $Y(0) = -1$

14. Solve the multiplier-acceleration equation (27) in the case $k = 1$, $A = 4$, $Y(0) = 2$, and $Y(1) = 5$. What happens as $t \to \infty$?

15. Show that the linear difference equation

$$Y(n + 2) - \alpha Y(n + 1) + \beta Y(n) = 0$$

has a real solution as long as $\beta < \alpha^2/4$.

16. (a) Show that all solutions of the equation

$$Y(n + 2) - \alpha Y(n + 1) + \beta Y(n) = 0$$

have the form $Y(n) = c_1 r_1^n + c_2 r_2^n$, where $r_1 - 1$ and $r_2 - 1$ are the roots of the quadratic equation $x^2 + (2 - \alpha)x + (1 + \beta - \alpha) = 0$, provided these roots are unequal.

(b) Prove that if $|r_1| > |r_2|$ and $c_1 \neq 0$, then

$$\lim_{n \to \infty} Y(n) = \lim_{n \to \infty} c_1 r_1^n$$

In particular,
(i) $Y(n)$ is unbounded if $r_1 > 1$
(ii) $Y(n)$ converges to zero if $-1 < r_1 < 1$
(iii) $Y(n)$ oscillates without bound if $r_1 < -1$
(iv) $Y(n)$ converges to $c_1$ if $r_1 = 1$
(v) $Y(n)$ oscillates but is bounded if $r_1 = -1$




13.6 METHOD OF COMPLEMENTARY AND 
PARTICULAR SOLUTIONS 

Having determined, at least in principle, a method for solving difference equations of the form

$$\phi(\Delta)f(n) = 0 \tag{30}$$

let us now consider equations like

$$\phi(\Delta)f(n) = a(n) \tag{31}$$

where $a(n)$ is some given function of $n$.

Suppose that we have somehow found a solution $g(n)$ of (31) and suppose that $f(n)$ is any other solution. Then since $\phi(\Delta)g(n) = a(n) = \phi(\Delta)f(n)$, we have

$$\phi(\Delta)[f(n) - g(n)] = \phi(\Delta)f(n) - \phi(\Delta)g(n) = a(n) - a(n) = 0$$

This means that the difference $f(n) - g(n)$ is a solution of (30) and hence that every solution $f(n)$ of (31) can be written as the sum of the particular solution $g(n)$ and a solution $f(n) - g(n)$ of (30). In short, to find all solutions of (31), we need find only one solution of that equation, to which we add all solutions of (30).

Standard terminology calls (30) the equation complementary to (31), and any solution of (30) is called a complementary solution of (31). The problem of finding all solutions of $\phi(\Delta)f(n) = a(n)$ has been reduced to the problem of finding any particular solution together with that of finding all complementary solutions. Since we know how to solve (30), our only problem (in theory at least) is that of finding a particular solution.

Example 1 To solve the equation

$$-\Delta^2 f(n) + 2\Delta f(n) = n \tag{32}$$

we first solve the complementary equation $\Delta(\Delta - 2)f(n) = 0$ to obtain complementary solutions of the form $C(n) = a + b3^n$, where $a$ and $b$ are constants.

To find a particular solution of (32),* we ask what function of $n$ is such that combinations of its first two differences would yield $n$. Since differences of polynomials are polynomials of lower degree and since differencing has produced the first-degree polynomial $n$, it seems reasonable to guess that we began with a third-degree polynomial of the form

$$p(n) = c_0 + c_1 n + c_2 n^2 + c_3 n^3 \tag{33}$$

If (33) is to be a solution of (32), we must have $-\Delta^2 p(n) + 2\Delta p(n) = n$ or

$$-\Delta^2(c_0 + c_1 n + c_2 n^2 + c_3 n^3) + 2\Delta(c_0 + c_1 n + c_2 n^2 + c_3 n^3) = n$$

Writing the indicated differences and then simplifying leads to the equation $2c_1 - 4c_3 + 4c_2 n + 6c_3 n^2 = n$. To make this an identity in $n$, we need $c_3 = 0$, $c_2 = \tfrac{1}{4}$, and $c_1 = 0$. No restriction is placed on $c_0$, and we arbitrarily choose it to be zero.

With these choices, our particular solution is $p(n) = \tfrac{1}{4}n^2$ which, together with the complementary solutions already obtained, means that all solutions of (32) look like

$$f(n) = \tfrac{1}{4}n^2 + a + b3^n$$

where $a$ and $b$ are constants. ►

* Note that we still have no solutions of (32), since the so-called complementary solutions are actually solutions of $-\Delta^2 f(n) + 2\Delta f(n) = 0$ and are not solutions of (32) itself.


In guessing at the form of a particular solution, we found that the terms $c_0$, $c_1 n$, and $c_3 n^3$ were superfluous. At the same time, we ran the risk of guessing incorrectly and consequently of finding no solution at all. This is an unsatisfactory state of affairs but is easily remedied, as follows.

Suppose $f(n)$ is a solution of (32), that is, suppose $-\Delta^2 f(n) + 2\Delta f(n) = n$. Since identical functions have identical differences, it follows that

$$\Delta^2[-\Delta^2 f(n) + 2\Delta f(n)] = \Delta^2 n = 0$$

and hence every solution of (32) is also a solution of this new equation. Factoring the operator gives

$$\Delta^3(\Delta - 2)f(n) = 0$$

so that

$$f(n) = a + bn + cn^2 + d3^n$$

is the form which every solution of (32) must have.

The terms $a$ and $d3^n$ being part of the complementary solution, only $bn + cn^2$ is necessary for a particular solution. Substituting this in (32) gives $2b + 4cn = n$, hence $b = 0$ and $c = \tfrac{1}{4}$, so that the complete solution is

$$f(n) = \tfrac{1}{4}n^2 + a + d3^n$$

as before.

The key to this approach is that in choosing the operator $\Delta^2$ in order to annihilate (reduce to zero) the right-hand side, we arrive at an equation whose solution form is known. All guesswork concerning the form of a particular solution is thus eliminated.
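A short Python sketch can confirm that the complete solution $f(n) = \tfrac{1}{4}n^2 + a + d3^n$ really satisfies (32) whatever the constants are. The values 5 and -2 used for the constants are arbitrary illustrative choices, and `delta` is a hypothetical helper for the first difference.

```python
from fractions import Fraction

def delta(seq):
    """First difference of a finite sequence: (Delta f)(n) = f(n+1) - f(n)."""
    return [seq[n + 1] - seq[n] for n in range(len(seq) - 1)]

def f(n, a=Fraction(5), d=Fraction(-2)):
    """Complete solution n**2/4 + a + d*3**n with arbitrary constants a and d."""
    return Fraction(n * n, 4) + a + d * 3 ** n

values = [f(n) for n in range(10)]
first = delta(values)
second = delta(first)

# Left side of (32): -Delta^2 f(n) + 2*Delta f(n); it should reproduce n.
lhs = [-second[n] + 2 * first[n] for n in range(len(second))]
print(lhs == list(range(len(second))))
```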

Example 2 Using Equations (4) and (5) of Section 13.1 and proceeding as in Example 2 of Section 13.5, we can write the equation

$$y(n + 3) - 6y(n + 2) + 3y(n + 1) + 10y(n) = 8n^2 + 4n \tag{34}$$

in operator form as

$$(\Delta - 4)(\Delta + 2)(\Delta - 1)y(n) = 8n^2 + 4n$$

The operator $\Delta^3$ annihilates the right-hand side. Thus the complete solution has the form

$$y(n) = c_0 + c_1 n + c_2 n^2 + c_3 5^n + c_4(-1)^n + c_5 2^n$$

in which the constants $c_0$, $c_1$, and $c_2$ must be chosen to provide a particular solution, while the last three terms constitute the complementary solution. Substituting the particular solution $c_0 + c_1 n + c_2 n^2$ into the original equation (34) and equating coefficients of like powers of $n$ gives $c_0 = 3$, $c_1 = 2$, and $c_2 = 1$. The complete solution is

$$y(n) = 3 + 2n + n^2 + c_3 5^n + c_4(-1)^n + c_5 2^n$$ ►
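The coefficient matching in Example 2 amounts to solving a small linear system. The Python sketch below is one way to do it under stated assumptions, not the book's procedure: it applies the left side of (34) to the basis functions $1$, $n$, $n^2$ at three sample points and solves for $c_0$, $c_1$, $c_2$ exactly. Because both sides are quadratics in $n$, agreement at three points is enough. The helper names are hypothetical.

```python
from fractions import Fraction

def apply_operator(y, n):
    """Left side of (34): y(n+3) - 6*y(n+2) + 3*y(n+1) + 10*y(n)."""
    return y(n + 3) - 6 * y(n + 2) + 3 * y(n + 1) + 10 * y(n)

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination over exact fractions (square, nonsingular system)."""
    size = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

basis = [lambda n: 1, lambda n: n, lambda n: n * n]
sample_points = [0, 1, 2]
matrix = [[apply_operator(b, n) for b in basis] for n in sample_points]
rhs = [8 * n * n + 4 * n for n in sample_points]
c0, c1, c2 = solve_linear(matrix, rhs)
print(c0, c1, c2)
```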

PROBLEMS

1. Complete the analysis of Example 2 by verifying that we must have $c_0 = 3$, $c_1 = 2$, and $c_2 = 1$.


Example 3 The linear difference equation $Y(n + 1) = \lambda Y(n) + k$, where $\lambda \neq 0$ and $k$ are constants, may be written in operator form as

$$[\Delta - (\lambda - 1)]Y(n) = k$$

Since the operator $\Delta$ annihilates $k$, the complete solution is $Y(n) = c + b\lambda^n$, the term $b\lambda^n$ representing the complementary solution.

To determine the constant $c$, we require $[\Delta - (\lambda - 1)]c = k$, or

$$c = \frac{k}{1 - \lambda}$$

The solution $Y(n) = [k/(1 - \lambda)] + b\lambda^n$ contains an arbitrary constant $b$. If the initial value $Y(0)$ is known, then we have $Y(0) = k/(1 - \lambda) + b$ or $b = Y(0) - k/(1 - \lambda)$. The solution is

$$Y(n) = \frac{k}{1 - \lambda} + \left[Y(0) - \frac{k}{1 - \lambda}\right]\lambda^n$$

and it is apparent that the limiting behavior of the sequence $(Y(n))$ depends on the term $\lambda^n$, the other quantities being constant. In particular, since $(\lambda^n)$ diverges when $|\lambda| > 1$ and when $\lambda = -1$ and converges otherwise, we find that

(i) $Y(n)$ is unbounded when $\lambda > 1$
(ii) $Y(n)$ converges to $k/(1 - \lambda)$ when $0 < \lambda < 1$
(iii) $Y(n)$ oscillates, but converges to $k/(1 - \lambda)$ when $-1 < \lambda < 0$
(iv) $Y(n)$ oscillates between $Y(0)$ and $2k/(1 - \lambda) - Y(0)$ when $\lambda = -1$
(v) $Y(n)$ oscillates and is unbounded when $\lambda < -1$

The limit $\lim_{n \to \infty} Y(n) = k/(1 - \lambda)$, when it exists (that is, when $|\lambda| < 1$), is called the equilibrium state of the solution, while the term $[Y(0) - k/(1 - \lambda)]\lambda^n$ represents the transient part of the solution. ►
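The split into an equilibrium state and a transient part can be seen numerically. The Python sketch below compares direct iteration of $Y(n + 1) = \lambda Y(n) + k$ with the closed form, using illustrative values ($\lambda = -0.5$, $k = 3$, $Y(0) = 10$) that are assumptions of this sketch and fall under case (iii). It assumes $\lambda \neq 1$ so that $k/(1 - \lambda)$ is defined.

```python
def first_order_solution(lam, k, y0, n):
    """Closed form k/(1 - lam) + (Y(0) - k/(1 - lam)) * lam**n, assuming lam != 1."""
    equilibrium = k / (1 - lam)
    return equilibrium + (y0 - equilibrium) * lam ** n

def iterate(lam, k, y0, n):
    """Direct iteration of Y(n+1) = lam*Y(n) + k, starting from Y(0) = y0."""
    y = y0
    for _ in range(n):
        y = lam * y + k
    return y

lam, k, y0 = -0.5, 3.0, 10.0   # illustrative values with -1 < lam < 0
for n in (0, 1, 2, 5, 40):
    print(n, iterate(lam, k, y0, n), first_order_solution(lam, k, y0, n))
```

With these values the equilibrium state is $3/1.5 = 2$, and the printed values alternate above and below it while the transient term shrinks.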





Example 4 The cobweb model introduced in Example 3 of Section 5.2 generates the linear difference equation

$$S(t + 1) = \frac{d}{b}S(t) + \frac{bc - ad}{b}$$

whose solution is

$$S(t) = \frac{bc - ad}{b - d} + \left[S(0) - \frac{bc - ad}{b - d}\right]\left(\frac{d}{b}\right)^t \tag{35}$$

Since by assumption $b < 0$ and $d > 0$, the sequence $(S(t))$ is always oscillatory, being unbounded when $d > |b|$, bounded when $d = |b|$, and convergent to the equilibrium state $(bc - ad)/(b - d)$ when $d < |b|$. (Compare with Problem 4 in Section 5.4.)

The time path for a convergent cobweb is shown in Figure 1. The linear demand function $D(t) = a + bP(t)$ and the lagged linear supply function $S(t) = c + dP(t - 1)$ are plotted against price.

FIGURE 1 (vertical axis: D or S)

An initial price $P(0) = P_0$ leads to an initial supply $S(1) = c + dP_0$ on the supply curve. (Follow the arrows in Figure 1.) Market clearing requires that initial demand $D(1)$ equal $S(1)$, which leads to a new price $P(1) = P_1$ obtained by solving the equation $D(1) = a + bP_1 = S(1)$ for $P_1$. The price $P_1$ leads to a new supply $S(2) = c + dP_1$, which generates a new demand $D(2) = S(2)$, a new price $P_2$ obtained by solving $a + bP_2 = S(2)$ for $P_2$, etc. The oscillatory nature of the time path is apparent in the figure.


PROBLEMS

2. Verify that (35) is indeed the correct solution to the cobweb equation.

3. Show that the equilibrium price is $(c - a)/(b - d)$.

Example 5 According to Theorem 8, the operator $(\Delta - a)^k$ serves as an annihilator for $n^{k-1}(1 + a)^n$. Thus to solve the equation

$$Y(t + 2) - 6Y(t + 1) + 9Y(t) = 2t + 3^t \tag{36}$$

or, in operator form, $(\Delta - 2)^2 Y(t) = 2t + 3^t$, we first apply the operator $\Delta^2(\Delta - 2)$ to both sides to obtain $\Delta^2(\Delta - 2)^3 Y(t) = 0$. The complete solution of this equation has the form

$$Y(t) = c_0 + c_1 t + (c_2 + c_3 t + c_4 t^2)3^t$$

Because of the operator $(\Delta - 2)^2$ in the original equation, the terms $(c_2 + c_3 t)3^t$ constitute the complementary solution, leaving

$$c_0 + c_1 t + c_4 t^2 3^t$$

as the form of the particular solution.

A little computation shows that $(\Delta - 2)^2[c_0 + c_1 t + c_4 t^2 3^t] = 2t + 3^t$ if and only if $c_0 = c_1 = \tfrac{1}{2}$ and $c_4 = \tfrac{1}{18}$. Hence the complete solution of (36) is

$$Y(t) = (c_2 + c_3 t)3^t + \tfrac{1}{2} + \tfrac{1}{2}t + \tfrac{1}{18}t^2 3^t$$ ►
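The "little computation" in Example 5 can be double-checked with a short Python sketch that substitutes the particular solution into (36) and confirms the residual vanishes, with and without an added complementary term. The constants 4 and -7 in the complementary term are arbitrary illustrative choices.

```python
from fractions import Fraction

def particular(t):
    """Particular solution 1/2 + t/2 + t**2 * 3**t / 18 from Example 5."""
    return Fraction(1, 2) + Fraction(t, 2) + Fraction(t * t * 3 ** t, 18)

def residual(y, t):
    """Left side minus right side of Y(t+2) - 6Y(t+1) + 9Y(t) = 2t + 3**t."""
    return y(t + 2) - 6 * y(t + 1) + 9 * y(t) - (2 * t + 3 ** t)

def with_complementary(t, c2=Fraction(4), c3=Fraction(-7)):
    """Particular solution plus an arbitrary complementary term (c2 + c3*t) * 3**t."""
    return particular(t) + (c2 + c3 * t) * 3 ** t

print(all(residual(particular, t) == 0 for t in range(10)),
      all(residual(with_complementary, t) == 0 for t in range(10)))
```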

PROBLEMS

In Problems 4-11, find all solutions of the indicated equation.

4. $Y(t + 2) = 5Y(t + 1) - 4Y(t) + 22$

5. $f(n + 2) = 4f(n + 1) - 3f(n) + 6$

6. $g(t + 2) + 6g(t + 1) - 7g(t) + 16 = 0$

7. $Y(n + 2) = 2Y(n + 1) + 3Y(n) + 2n^2 + 6n - 3$

8. $8Y(t + 2) - 6Y(t + 1) + Y(t) = 3^t$

9. $g(n + 2) - 4g(n + 1) + 3g(n) = 2^n - 3n$

10. $h(t + 2) - 6h(t + 1) + 9h(t) = t\,6^t$

11. $4Y(n + 2) + 4Y(n + 1) - 3Y(n) = 2^n(n + 1)$

Solve the equations in Problems 12-15 subject to the given initial conditions.

12. $Y(t + 2) + Y(t + 1) - 2Y(t) = 12$, $Y(0) = 8$, $Y(1) = 10$

13. $f(t + 2) - 4f(t + 1) + 4f(t) = 10$, $f(0) = 1$, $f(1) = 8$

14. $f(n + 2) - 2f(n + 1) - 3f(n) = 12$, $f(0) = 12$, $f(1) = 4$

15. $Y(t + 2) - 4Y(t + 1) + 3Y(t) = 6$, $Y(0) = 1$, $Y(1) = 2$

16. Prove that the difference equation

$$Y(t + 2) - (1 + \beta)Y(t + 1) + \beta Y(t) = k$$

has the solution

$$Y(t) = c_0 + c_1\beta^t + kt/(1 - \beta)$$

($\beta$, $k$, $c_0$, and $c_1$ are constants.)

17. Anderson* proposed the difference equation model of response extinction

$$R(n + 1) = R(n) - \theta[R(n) - R(\infty)]$$

where $R(n)$ is the response on extinction trial $n$, $R(\infty)$ is the extinction response level at equilibrium, and $0 < \theta < 1$ is the extinction rate. Show that the general solution is

$$R(n) = R(\infty) - [R(\infty) - R(1)](1 - \theta)^{n-1}$$

18. Given that the demand and supply functions for a commodity are, respectively, $D(t) = 10 - 4P(t)$ and $S(t) = 3P(t - 1)$,
(a) deduce a cobweb model for the price of the commodity
(b) solve the resulting equation
(c) compute $P(t)$ ($t = 0, 1, 2, 3, 4, 5$) if $P(0) = 1$
(d) graph the cobwebs around the demand and supply functions for $t = 0, 1, 2, \ldots, 5$


19. Modify the Harrod model (see Example 5 of Section 13.1) to the extent that total investment during period $t$ is assumed to be

$$I(t) = g[Y(t) - Y(t - 1)] + kY(t) + L$$

where $k$ and $L > 0$ are constants. Given no other changes,
(a) show that income $Y(t)$ satisfies the difference equation

$$Y(t + 1) = \frac{g}{g + k - c}Y(t) - \frac{L}{g + k - c}$$

(b) solve the difference equation for $Y(t)$
(c) verify that if $g$ and $c$ are both positive and if $k < c$, the ratio

$$\frac{Y(t + 1) - Y(t)}{Y(t + 1)}$$

must now increase, whereas in the original model this ratio was the constant $c/g$


20. Consider the nonlinear difference equation 


ci fin 4- 1) = 


Cjfin) 

1 4- c 2 /(«) 


where $c_1$ and $c_2$ are constants and $f(0) > 0$.

(a) Show that by making the substitution $y(n) = 1/f(n)$, the nonlinear equation is transformed into the linear difference equation

3 0> 4- l ) - “ > 00 4- cj 

* Anderson, N. H., "Comparison of Different Populations: Resistance to Extinction and Transfer," Psychological Review 70, 162-179 (1963).



(b) Use the transformed equation to find the solution of the original 
equation 

21. (a) Show that if $b$ is constant,

$$\Delta \sin bn = [\cos b - 1]\sin bn + \sin b \cos bn$$
$$\Delta \cos bn = [\cos b - 1]\cos bn - \sin b \sin bn$$
(b) Use the information in part (a) to solve the equation 
Y(n + 2) - Y(n) - sin \ttn 

[Hint: Proceeding as in Example 1, guess a particular solution of the form $c_1 \sin bn + c_2 \cos bn$, where $c_1$ and $c_2$ are constants.*]

22. Suppose that a sum of money $A$ is borrowed at 6% interest compounded annually and is retired by a payment of amount $R$ at the end of each annual interest period. Let $P(n)$ be the outstanding principal just after the $n$th payment of $R$.

(a) Show that $P(n)$ satisfies the difference equation

$$P(n + 1) = (1 + 0.06)P(n) - R$$

(b) Determine the periodic payment $R$ needed to retire the debt in $k$ periods.

23. Let $S$ be a set of $N$ community residents. At time period $t$, $N(t)$ residents favor and $N - N(t)$ residents oppose a local community issue. In each time period, $100\beta\%$ of those who previously favored the issue and $100\alpha\%$ of those who previously opposed the issue change their position.

(a) Show that

$$N(t + 1) = (1 - \alpha - \beta)N(t) + \alpha N$$

(b) Discuss the limiting behavior of $N(t)$. Compare with Problem 20 of Section 6.2.


24. In the cobweb model, it was assumed that the market was cleared at every time period, that is, $D(t) = S(t)$. Suppose instead that we assume

$$D(t) = a - bP(t)$$
$$S(t) = -c + dP(t)$$
$$P(t + 1) = P(t) + \alpha[D(t) - S(t)]$$

where $\alpha > 0$ denotes a demand-induced price adjustment coefficient and $a$, $b$, $c$, and $d$ are all positive. Show that the model has the time path

$$P(t) = \left[P(0) - \frac{a + c}{b + d}\right][1 - \alpha(b + d)]^t + \frac{a + c}{b + d}$$


* Special methods, such as those illustrated in Problems 20 and 21, for solving equations not covered by our general discussion, may be found in Goldberg, S., Introduction to Difference Equations: With Illustrative Examples from Economics, Psychology, and Sociology (John Wiley & Sons, Inc., New York, 1958).



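25. In relation to Problem 24, discuss the behavior of the time path $P(t)$ when

(i) $0 < \alpha < \dfrac{1}{b + d}$
(ii) $\alpha = \dfrac{1}{b + d}$
(iii) $\dfrac{1}{b + d} < \alpha < \dfrac{2}{b + d}$
(iv) $\alpha = \dfrac{2}{b + d}$
(v) $\alpha > \dfrac{2}{b + d}$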

26. Suppose that we modify the multiplier-acceleration model (Example 2 of Section 13.5) to assume that

$$Y(t) = C(t) + I(t) + G$$
$$C(t) = \gamma Y(t - 1) \qquad 0 < \gamma < 1$$
$$I(t) = \alpha[C(t) - C(t - 1)] \qquad \alpha > 0$$

where $G$ is a constant. Find the time path $Y(t)$.

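27. In Problem 26, discuss the stability of the equilibrium in each of the following cases:

(a) $\gamma > \dfrac{4\alpha}{(1 + \alpha)^2}$, $\alpha\gamma > 1$
(b) $\gamma > \dfrac{4\alpha}{(1 + \alpha)^2}$, $\alpha\gamma < 1$
(c) $\gamma = \dfrac{4\alpha}{(1 + \alpha)^2}$, $\alpha < 1$
(d) $\gamma = \dfrac{4\alpha}{(1 + \alpha)^2}$, $\alpha > 1$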

28. Show that if $Z(t)$ is a particular solution of

$$Y(t + 2) - \alpha Y(t + 1) + \beta Y(t) = f(t)$$

and if $W(t)$ is a particular solution of

$$Y(t + 2) - \alpha Y(t + 1) + \beta Y(t) = g(t)$$

then $Z(t) + W(t)$ is a particular solution of

$$Y(t + 2) - \alpha Y(t + 1) + \beta Y(t) = f(t) + g(t)$$


SUPPLEMENTARY READING

Goldberg, S., Introduction to Difference Equations: With Illustrative Examples from Economics, Psychology, and Sociology (John Wiley & Sons, Inc., New York, 1958).



14 INTEGRAL CALCULUS


14.1 AREA

The area $A$ of the region in Figure 1 bounded by heavy lines is easily computed to be 3/2. But what of the area in Figure 2? Is there a simple means of computing it? The answer is “yes” and is provided by the second of the two major concepts of calculus: the integral. (A mathematically precise definition of area is given at the end of Section 14.2. In the present section, however, we shall proceed intuitively, using only the familiar definition for area of a rectangle.)

FIGURE 1

FIGURE 2

Before formally introducing the concept of an integral, let us look at an alternative and seemingly roundabout method of computing the area in Figure 1. In the interval $[0, 1]$ the graph of $f(x) = x + 1$ lies between the graphs of two constant functions, with values 1 and 2. Thus the required area $A$ is no greater than that of a rectangle having height two and base one, and no less than that of a smaller rectangle with height and base both equal to unity. In short, we have $1 < A < 2$.

A better approximation using rectangles may be obtained by partitioning the interval $[0, 1]$ into two intervals $[0, \tfrac{1}{2}]$ and $[\tfrac{1}{2}, 1]$, as shown in Figure 4. The dotted rectangles in the figure have respective areas $\tfrac{1}{2}$ and $\tfrac{3}{4}$. Since both lie entirely within the area $A$, we must have

$$\tfrac{1}{2} + \tfrac{3}{4} = \tfrac{5}{4} < A$$

Similarly, if we use the rectangles with bases on the horizontal axis and dashed tops, we find $A < \tfrac{3}{4} + 1 = \tfrac{7}{4}$. This second approximation,

$$\tfrac{5}{4} < A < \tfrac{7}{4}$$

is a considerable improvement over the initial result $1 < A < 2$.
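The two approximations above, and finer ones, can be reproduced with a short Python sketch. The helper `rectangle_bounds` is a hypothetical name; it assumes the function is increasing on the interval, so that the left endpoint of each subinterval gives the inscribed rectangle and the right endpoint the circumscribed one.

```python
from fractions import Fraction

def rectangle_bounds(f, a, b, n):
    """Lower and upper rectangle sums for an increasing f on [a, b] with n equal parts."""
    width = Fraction(b - a, n)
    lower = sum(width * f(a + k * width) for k in range(n))        # left endpoints
    upper = sum(width * f(a + (k + 1) * width) for k in range(n))  # right endpoints
    return lower, upper

f = lambda x: x + 1
for n in (1, 2, 10, 100):
    lower, upper = rectangle_bounds(f, 0, 1, n)
    print(n, lower, upper)
```

For $n = 1$ and $n = 2$ the printed bounds are the pairs $(1, 2)$ and $(5/4, 7/4)$ found above, and for larger $n$ both bounds close in on 3/2.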

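FIGURE 3

The pattern having been established, let us proceed to the general case. We partition the interval $[0, 1]$ into $n$ smaller intervals $[0, 1/n], [1/n, 2/n], \ldots, [(n - 1)/n, 1]$, each of length $1/n$. (See Figure 5.) In each smaller interval we choose, as above, one rectangle which lies below the graph and one rectangle which circumscribes a part of the area. In the $k$th interval, the height of the smaller rectangle is $(k - 1)/n + 1$, obtained by evaluating $f(x) = x + 1$ at the left endpoint $(k - 1)/n$; the height of the larger rectangle is $k/n + 1$, obtained by evaluating $f$ at the right endpoint $k/n$. Each rectangle has width $1/n$. Computing the areas and summing, we find

$$\frac{1}{n}\left[1 + \left(1 + \frac{1}{n}\right) + \left(1 + \frac{2}{n}\right) + \cdots + \left(1 + \frac{n - 1}{n}\right)\right] < A < \frac{1}{n}\left[\left(1 + \frac{1}{n}\right) + \left(1 + \frac{2}{n}\right) + \cdots + \left(1 + \frac{n}{n}\right)\right]$$

or, in summation notation,

$$\frac{1}{n}\sum_{k=1}^{n}\left[1 + \frac{k - 1}{n}\right] < A < \frac{1}{n}\sum_{k=1}^{n}\left[1 + \frac{k}{n}\right] \tag{1}$$

Since $\sum_{k=1}^{n} k = n(n + 1)/2$ (from Problem 6 of Section 5.2), this relation may be rewritten as

$$\frac{1}{n}\left[n + \frac{n - 1}{2}\right] < A < \frac{1}{n}\left[n + \frac{n + 1}{2}\right]$$

or, on simplifying,

$$\frac{3}{2} - \frac{1}{2n} < A < \frac{3}{2} + \frac{1}{2n}$$

FIGURE 5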



The area $A_k$ which lies under the curve in the $k$th interval thus satisfies the inequality

$$\frac{2}{n}\left[4 - \frac{4k^2}{n^2}\right] < A_k < \frac{2}{n}\left[4 - \frac{4}{n^2}(k^2 - 2k + 1)\right]$$

so that the total area $A = \sum A_k$ is restricted by

$$\sum_{k=1}^{n}\frac{2}{n}\left[4 - \frac{4k^2}{n^2}\right] < A < \sum_{k=1}^{n}\frac{2}{n}\left[4 - \frac{4}{n^2}(k^2 - 2k + 1)\right] \tag{2}$$

The sums in (2) may be simplified using the formulas

$$\sum_{k=1}^{n} k = \frac{n(n + 1)}{2}$$
$$\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6}$$

to give

$$8 - \frac{8}{n^3}\cdot\frac{n(n + 1)(2n + 1)}{6} < A < 8 - \frac{8}{n^3}\left[\frac{n(n + 1)(2n + 1)}{6} - \frac{2n(n + 1)}{2} + n\right]$$

As $n \to \infty$ both the left and right sides of this inequality converge to $8 - 8/3 = 16/3$. It follows that $A = 16/3$ and that the entire area in Figure 2 is $2A = 32/3$. Thus we see again that our scheme for computing areas does produce reasonable results.
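The simplification of the sums in (2) can be checked with a Python sketch that adds the rectangle areas term by term and compares the totals with the closed forms obtained from the two summation formulas. The function names are illustrative, and the sketch relies only on the expressions appearing in (2), rewriting $k^2 - 2k + 1$ as $(k - 1)^2$.

```python
from fractions import Fraction

def bounds_direct(n):
    """Sum the rectangle areas in (2) term by term for k = 1, ..., n."""
    w = Fraction(2, n)
    lower = sum(w * (4 - Fraction(4 * k * k, n * n)) for k in range(1, n + 1))
    upper = sum(w * (4 - Fraction(4 * (k - 1) ** 2, n * n)) for k in range(1, n + 1))
    return lower, upper

def bounds_closed(n):
    """The same bounds after applying the formulas for sum k and sum k**2."""
    s2 = Fraction(n * (n + 1) * (2 * n + 1), 6)
    s1 = Fraction(n * (n + 1), 2)
    lower = 8 - Fraction(8, n ** 3) * s2
    upper = 8 - Fraction(8, n ** 3) * (s2 - 2 * s1 + n)
    return lower, upper

for n in (1, 4, 100):
    print(n, bounds_direct(n) == bounds_closed(n), [float(v) for v in bounds_closed(n)])
```

As $n$ grows, the printed lower and upper bounds squeeze toward $16/3 \approx 5.333$.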

PROBLEMS

In each of the following problems, use the method outlined in this section to find the area which lies below the graph of the given function and above the indicated interval $[a, b]$ on the horizontal axis.

1. $f(x) = x$; $a = 0$, $b = 1$

2. $g(x) = 3x$; $a = 0$, $b = 1$

3. $h(x) = 2$; $a$ and $b$ arbitrary, except that $a < b$

4. $g(t) = t^2$; $a = -2$, $b = 1$

5. $h(t) = -t + 1$; $a = -2$, $b = 1$

6. $m(t) = t^2 - t + 1$; $a = -2$, $b = 1$

7. $f(z) = z + 2$; $a = 0$, $b = 1$

8. $f(z) = z + 2$; $a = 1$, $b = 2$

9. $f(z) = z + 2$; $a = 0$, $b = 2$

10. Let $c$ be a positive constant and $f$ be a function which is non-negative throughout the interval $[a, b]$. Formulate a theorem which relates the area below $f$ with that below $cf$, both areas lying above the interval $[a, b]$ on the horizontal axis. (Hint: Compare Problems 1 and 2.)

11. Let $f$ and $g$ be two functions each non-negative throughout the interval $[a, b]$. Formulate a theorem relating the area under $f + g$ with those under $f$ and under $g$. Again each area lies above $[a, b]$. (Hint: Compare the results of Problems 4-6.)

12. Let $f$ be a function non-negative throughout the interval $[a, b]$ and let $c$ be any number between $a$ and $b$. Formulate a theorem which relates the area below $f$ and above $[a, b]$ with those below $f$ and above $[a, c]$ and $[c, b]$, respectively. (Hint: Compare the results of Problems 7-9.)

13. Using the method for computing area indicated in this section, prove the statements formulated in Problems 10-12. Give a geometric interpretation of each result.

14. Let functions $f$ and $g$ be such that $f(x) > g(x)$ throughout the interval $[a, b]$. Formulate a method for computing the area in the interval $[a, b]$ which lies between $f$ and $g$. Use this method to find the area in the interval from $a = 0$ to $b = 1$ which lies between the graphs of $f(x) = 3x$ and $g(x) = x^2 - x$.


14.2 THE INTEGRAL

The method for computing areas introduced in Section 14.1 is closely related to the concept of the integral of a function. However, since integrals have wide application in situations where an area interpretation is not meaningful, we shall develop a definition of the integral which does not depend on any knowledge of area. Unless stated otherwise, all functions considered in this chapter are assumed to be continuous.

The integral of a function $f$ over an interval $[a, b]$ is denoted by the various notations

$$\int_a^b f, \quad \int_a^b f(x), \quad \text{or, most often,} \quad \int_a^b f(x)\,dx$$

To define the integral we begin, as with areas, by partitioning the interval $[a, b]$ into $n$ smaller intervals of equal width $(b - a)/n$ (see Figure 7). For $k = 1, \ldots, n$, we choose in the $k$th interval

$$\left[a + \frac{(k - 1)(b - a)}{n},\ a + \frac{k(b - a)}{n}\right]$$

a point $x_k$ at which the continuous function $f$ takes its largest value and a point $y_k$ at which $f$ takes its smallest value. We then form the upper sum

$$M_n = \frac{b - a}{n}f(x_1) + \cdots + \frac{b - a}{n}f(x_n) = \sum_{k=1}^{n}\frac{b - a}{n}f(x_k) \tag{3}$$

and the lower sum

$$m_n = \frac{b - a}{n}f(y_1) + \cdots + \frac{b - a}{n}f(y_n) = \sum_{k=1}^{n}\frac{b - a}{n}f(y_k) \tag{4}$$

FIGURE 7

Example 1 The function $h$ defined in the interval $[1, 3]$ by

$$h(t) = 1 + 4t - t^2 = 5 - (t - 2)^2$$

is shown in Figure 8. Cutting the interval into four smaller intervals each of width $\tfrac{1}{2}$, we obtain the upper sum

$$M_4 = \tfrac{1}{2}h(\tfrac{3}{2}) + \tfrac{1}{2}h(2) + \tfrac{1}{2}h(2) + \tfrac{1}{2}h(\tfrac{5}{2}) = \tfrac{1}{2}\left(\tfrac{19}{4} + 5 + 5 + \tfrac{19}{4}\right) = \tfrac{39}{4}$$

and the lower sum

$$m_4 = \tfrac{1}{2}h(1) + \tfrac{1}{2}h(\tfrac{3}{2}) + \tfrac{1}{2}h(\tfrac{5}{2}) + \tfrac{1}{2}h(3) = \tfrac{1}{2}\left(4 + \tfrac{19}{4} + \tfrac{19}{4} + 4\right) = \tfrac{35}{4}$$ ►

FIGURE 8

Example 2 When the interval $[0, 1]$ is partitioned into smaller intervals each of width $1/n$, the upper and lower sums for the function $f(x) = x + 1$ are (see Figure 1 and Equation (1) in Section 14.1)

$$M_n = \frac{3}{2} + \frac{1}{2n} \quad \text{and} \quad m_n = \frac{3}{2} - \frac{1}{2n}$$ ►

If the sequences of upper and lower sums,

$$(M_n) = (M_1, M_2, M_3, \ldots)$$

and

$$(m_n) = (m_1, m_2, m_3, \ldots)$$

determined by Equations (3) and (4), respectively, both converge and have a common limit

$$I = \lim (M_n) = \lim (m_n)$$

this limit is called the integral of $f$ on the interval $[a, b]$ and we write

$$I = \int_a^b f(x)\,dx \tag{5}$$

Example 3 In Example 2 the quantities $M_n$ and $m_n$ are given for the function $f(x) = x + 1$ on the interval $[0, 1]$ by

$$M_n = \frac{3}{2} + \frac{1}{2n} \quad \text{and} \quad m_n = \frac{3}{2} - \frac{1}{2n}$$

The sequences $(M_n)$ and $(m_n)$ converge to the common limit $\tfrac{3}{2}$. Thus the integral for this function is

$$\int_0^1 f(x)\,dx = \int_0^1 (x + 1)\,dx = \tfrac{3}{2}$$

Of course, this is the same answer as the area $A$ found in Figure 1. ►

If $f$ is a positive function throughout the interval $[a, b]$, the sums (3) and (4) have area interpretations. For $M_n$, for instance, the product

$$\frac{b - a}{n}f(x_k)$$

represents the area of a rectangle having width $(b - a)/n$ and height $f(x_k)$ (see Figure 7). Since in the $k$th interval, $f$ is largest at $x_k$, the area of this rectangle is equal to or larger than the area under the graph of $f$. The sum $M_n$ thus represents an upper bound to the area under the curve between $a$ and $b$. Similar comments apply to show that $m_n$ is a lower bound for this same area. Since, when the integral exists, $M_n$ and $m_n$ converge to the same limit, we see that the two concepts area and integral are equivalent when the function is positive throughout the interval.

Example 4 Suppose the function $g$ is defined on the interval $[-1, 2]$ by $g(x) = x^3$. If we divide the interval into six smaller intervals, each of length $\tfrac{1}{2}$, we find (see Figure 9)

$$m_6 = \tfrac{1}{2}(-1)^3 + \tfrac{1}{2}(-\tfrac{1}{2})^3 + \tfrac{1}{2}(0)^3 + \tfrac{1}{2}(\tfrac{1}{2})^3 + \tfrac{1}{2}(1)^3 + \tfrac{1}{2}(\tfrac{3}{2})^3 = \tfrac{27}{16}$$

FIGURE 9

If an interpretation in terms of area is attempted here, we notice that some rectangles in Figure 9 lie below the horizontal axis. The areas of these
rectangles are counted negatively or subtracted, while the areas of the rectangles which lie above the axis are counted positively or added. ►

PROBLEMS

1. For the function $g$ defined in Example 4, find $M_1$ and $m_1$.

Example 5 Suppose $f$ is the constant function

$$f(x) = c \qquad a \le x \le b$$

Then, regardless of the values of $x_1, x_2, \ldots, x_n$, we have

$$\sum_{k=1}^{n}\frac{b - a}{n}f(x_k) = \sum_{k=1}^{n}\frac{b - a}{n}c = c(b - a)$$

This means that $(M_n)$ and $(m_n)$ are constant sequences and that

$$\int_a^b f(x)\,dx = \int_a^b c\,dx = c(b - a)$$

If $c$ is positive, this integral represents the area of the rectangle in Figure 10(a). On the other hand, if $c$ is negative, then $c(b - a)$ is negative and the integral itself cannot be interpreted as area. However, the absolute value $|c(b - a)|$ represents the area of the region in Figure 10(b). ►

FIGURE 10

Example 6 By contrast, the function $f$ defined on $[0, 1]$ by

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

does not have an integral. For we know that every interval, no matter how small, contains both rational and irrational points. Hence, by choosing each $x_k$ to be a rational number and each $y_k$ to be irrational in Equations (3) and (4), we obtain the sums

$$M_n = \sum_{k=1}^{n}\frac{1}{n}f(x_k) = \sum_{k=1}^{n}\frac{1}{n} = 1$$
$$m_n = \sum_{k=1}^{n}\frac{1}{n}f(y_k) = \sum_{k=1}^{n} 0 = 0$$

The two sequences converge, but not to the same limit. ►

The following definition summarizes our discussion and introduces some useful terminology.

Definition of Integral Let $f$ be a continuous real-valued function defined throughout an interval $[a, b]$. The number $I$ is called the definite integral of $f$ on the interval $[a, b]$ if $I$ is the common limit of the sequences $(M_n)$ and $(m_n)$ defined in Equations (3) and (4). The function $f$ is called the integrand, and if the integral exists, $f$ is said to be integrable on $[a, b]$. The number $I$ is denoted by any of the notations

$$I = \int_a^b f, \quad \int_a^b f(x), \quad \text{or} \quad \int_a^b f(x)\,dx$$

The numbers $a$ and $b$ are called, respectively, the lower and upper limits of integration.

Although the computations and discussion of Section 14.1 were concerned with area, careful reading will show that the concept of area itself remained intuitive and was never defined. Although area provided a prime motivation for the definition of integral, in the definition we carefully avoided any mention of area. Having made the concept of integral independent of area, it is now logically permissible to define area in terms of integral.

Definition of Area Let $f$ be a function which is non-negative throughout the interval $[a, b]$. Then the area $A$ of the region shaded in Figure 11(a) is

$$A = \int_a^b f(x)\,dx$$

If $f(x) > g(x)$ for all $x \in [a, b]$, then the area of the region shaded in Figure 11(b) is

$$A = \int_a^b [f(x) - g(x)]\,dx$$


400 / INTEGRAL CALCULUS 



integral, of $f$ and we write

$$F(x) = \int f(x)\,dx$$


The next theorem shows that all antiderivatives of a given function are
closely related. (Compare with Theorem 4 in Section 3 )

Theorem 1 If functions F and G have the same derivative throughout an
interval [a, b], then there is a constant C such that for all $a \le x \le b$,

$$F(x) = G(x) + C$$

PROOF Define the function H by $H(x) = F(x) - G(x)$. Then

$$H'(x) = F'(x) - G'(x) = 0$$

since the derivatives of F and G are equal throughout the interval. If y and z
are any two points of the interval, the Mean Value Theorem guarantees the
existence of a point c between y and z at which

$$H(z) - H(y) = H'(c)(z - y)$$

Since $H'(c) = 0$, it follows that $H(z) = H(y)$. This being true for all y
and z, H must be a constant function, and our proof is complete ►

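Example 1 (a) The fact that $D(x^2 + 2x + 7) = 2x + 2$ means that
$x^2 + 2x + 7$ is an antiderivative of $f(x) = 2x + 2$. On the other
hand, if G is any other antiderivative of f, then by Theorem 1 it differs
from $x^2 + 2x + 7$ by a constant. Thus all antiderivatives
of f are of the form $x^2 + 2x + 7 + C$ or, since $7 + C$
is itself some constant, are of the form

$$G(x) = x^2 + 2x + B$$

where B is a constant.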

(b) The derivative of $G(x) = -\cos x + \ln x + \tfrac13 x^3$ is

$$G'(x) = \sin x + \frac1x + x^2$$

Hence every antiderivative of $\sin x + (1/x) + x^2$ has the form

$$F(x) = -\cos x + \ln x + \frac{x^3}{3} + C$$

where C is a constant.

The equals sign in $F(x) = \int f(x)\,dx$ must be properly interpreted to mean
"F is an antiderivative of f," not to mean "F is the antiderivative of f."
For if a function f has one antiderivative F, then it has an infinite number,
these being of the form F + C where C is a constant.
On the other hand, Theorem 1 guarantees that antiderivatives of the
same function f can differ at most by a constant. If one anti-



figure 11 



PROBLEMS 

For Problems 2-8, the given function $f$ is defined on the interval [0, 1]. Find
the upper and lower sums $M_n$ and $m_n$ which are obtained from the partition
$\{0, 1/n, 2/n, \ldots, (n - 1)/n, 1\}$. If possible, simplify your results by
applying appropriate summation formulas.

2. $f(x) = x$    3. $f(x) = 2x + 3$    4. f(x) = £ - x

5. $f(x) = x^{1/2}$    6. $f(x) = x^2$    7. $f(x) = 3x^2 + 2x$

8. $f(x) = 3x^3 + 2x^2 - x + 6$

9. By taking limits as $n \to \infty$, find the definite integral of $f$ on [0, 1] in
Problems 2-4 and 6-8.


10. Find the area of the region between $f(x) = x$ and $g(x) = x^2$ over the
interval [0, 1]

11 Evaluate these definite integrals


(a) 3xdx
(c) j M the

w i>-


W) dx


x dx

(x + |x|) dx
(r + r - 1) dx

14.3 ANTIDERIVATIVES

If a function $f$ is the derivative of another function F, that is, if $F'(x) = f(x)$
for all x in the domain of $f$, then F is called an antiderivative, or an indefinite




$\int \cos x\,dx = \sin x + C$
$\int \sin x\,dx = -\cos x + C$
$\int \sec^2 x\,dx = \tan x + C$


Example 2 (a) Since $D\cos x = -\sin x$, all antiderivatives of
$f(x) = -\sin x$ look like

$$F(x) = \cos x + C$$

where C is a constant.

(b) If $\phi$ has a derivative, then

$$\int e^{\phi(x)}\phi'(x)\,dx = e^{\phi(x)} + C$$

Hence all antiderivatives of $f(x) = 2xe^{x^2}$ look like

$$F(x) = e^{x^2} + C$$

where C is a constant.

(c) All antiderivatives of $g(x) = \sin 3x - e^{-2x} + x$ look like

$$G(x) = -\tfrac13\cos 3x + \tfrac12 e^{-2x} + \tfrac12 x^2 + C$$

(d) If $f$ has a derivative, then

$$D[f(x)]^{n+1} = (n + 1)[f(x)]^n f'(x)$$

In terms of antiderivatives,

$$\int [f(x)]^n f'(x)\,dx = \frac{1}{n+1}[f(x)]^{n+1} + C$$

Thus, for example, $x(x^2 + 3)^{-1/3}$ is the derivative of

$$\frac12\cdot\frac{(x^2+3)^{2/3}}{2/3} = \frac34 (x^2 + 3)^{2/3}$$

Similar reasoning applied to the equation 

Dfi(\) = 3\(y 2 + 3r • 1/3 -f (\ - l)(2x- - 4\ -f I)'' 


>iclds 


A(v) - 


3 (v 2 + 2 (2-t 3 - 4t+ l) T 

2 3 _r 4 7 + 

;<y + 3) 3;j + j’.cix 2 if+ c 


► 






derivative is known, then, all are known. This equivalence of meaning
between $F(x) = \int f(x)\,dx$ and $F'(x) = f(x)$ means that each derivative
relation listed in Table 2 of Chapter 12 has a counterpart in terms of
antiderivatives. For easy reference we list the most useful cases in Table 1. A
great many antiderivatives may be found by direct use of Table 1, together
with the Chain Rule (Section 12.4). (Techniques for obtaining antiderivatives
of more complicated functions are discussed in Sections 14.7-14.9.) Note
that absolute value is indicated in lines 6 and 7. This is because if x is positive,
then $D\ln x = 1/x$, while if x is negative, $D\ln(-x) = D(-x)/(-x) = 1/x$,
also. The formula $\int (1/x)\,dx = \ln|x| + C$ takes care of both cases.


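TABLE 1 A short table of antiderivatives

1. $\int DF(x)\,dx = F(x) + C$

2. $\int 0\,dx = C$

3. $\int 1\,dx = x + C$

4. $\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + C \quad (n \ne -1)$

5. $\int [f(x)]^n f'(x)\,dx = \dfrac{[f(x)]^{n+1}}{n+1} + C \quad (n \ne -1)$

6. $\int x^{-1}\,dx = \ln|x| + C$

7. $\int \dfrac{f'(x)}{f(x)}\,dx = \ln|f(x)| + C$

8. $\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$

9. $\int a f(x)\,dx = a\int f(x)\,dx$, a is a constant

10. $\int [c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x)]\,dx = c_1\int f_1(x)\,dx + c_2\int f_2(x)\,dx + \cdots + c_n\int f_n(x)\,dx$

11. $\int e^x\,dx = e^x + C$

12. $\int a^x\,dx = \dfrac{a^x}{\ln a} + C$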





Fundamental Theorem of Calculus If $\int_a^b f(x)\,dx$ exists and if there is
a function F(x) whose derivative is f(x), then

$$\int_a^b f(x)\,dx = F(b) - F(a)$$ ►

The statement of this theorem is remarkable. If $f$ is integrable, all we need
do in order to evaluate its integral is to find another function, any one at all,
whose derivative is $f$. The rest is a matter of simple computation. Before
proving the Fundamental Theorem, let us first apply it to the problem of
computing the area in Figure 2.

Example 1 The function $g(x) = 4 - x^2$ is integrable, as we have already
seen. The area in Figure 2 is thus equal to

$$\int_{-2}^{2} g(x)\,dx = \int_{-2}^{2} (4 - x^2)\,dx$$

Since $G(x) = 4x - \tfrac13 x^3$ is one antiderivative of $4 - x^2$, the desired area is,
according to the Fundamental Theorem,

$$G(2) - G(-2) = 4(2) - \frac83 - \left[4(-2) - \frac{(-8)}{3}\right] = \frac{32}{3}$$

verifying the computations in Section 14.1 ►

PROOF OF THE FUNDAMENTAL THEOREM Let $t = (b - a)/n$ and
define points $t_0, t_1, t_2, \ldots, t_n$ (Figure 12) by

$$t_0 = a, \quad t_1 = a + t, \quad t_2 = a + 2t, \quad \ldots, \quad t_n = b$$

By successively adding and subtracting the numbers $F(t_1), F(t_2), \ldots, F(t_{n-1})$,
the difference

$$F(b) - F(a) = F(t_n) - F(t_0)$$


FIGURE 12 






PROBLEMS

Find the following antiderivatives 


lx + 3 ■ 




*2 + 2x + 1 

Problems 4 > 9 * and 12 » multiply out first In Problems 13, 20, 
an 22 , perform the indicated division first In Problems 7-9, II, 12, 17, and 
, appy Formula 5 in Table 1 In Problems 10, 14, 15, and 21, apply 
Formula 7 in Table 1 ) 


14.4 FUNDAMENTAL THEOREM OF CALCULUS
OR INTEGRATION MADE EASY

To evaluate an integral directly from the definition in Section 14.2 is ordinarily
an arduous task. However, by making a slight restriction on the kinds of
functions for which we try to find integrals, we can make the problem of
evaluation relatively easy, while at the same time losing almost no practical
applicability. The result which simplifies the computation of integrals also
establishes the link between the two main operations of calculus:
differentiation and integration.





The result of Theorem 1 in Section 14.3 indicates clearly why, in the
statement of the Fundamental Theorem, no particular antiderivative F of $f$
was required in the computation

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

For if G is any other antiderivative of $f$, then there is a constant C such that
$G(x) = F(x) + C$ for all x. This being the case, we find

$$G(b) - G(a) = [F(b) + C] - [F(a) + C] = F(b) - F(a)$$

The constant C is simply added in and subtracted out, the actual value chosen
being of no consequence.

It is convenient to use the notation

$$F(x)\Big]_a^b$$

to denote the difference $F(b) - F(a)$ which appears in the Fundamental
Theorem. That is, $F(x)]_a^b$ indicates that the function F is to be evaluated
first at b, then at a, and the difference computed.

Example 3 We know that $D\sin x = \cos x$. Thus

$$\int_0^{\pi/2} \cos x\,dx = \sin x\Big]_0^{\pi/2} = \sin\frac{\pi}{2} - \sin 0 = 1 - 0 = 1$$

The area under the cosine curve between zero and $\pi/2$ is unity as shown in
Figure 13. ►


FIGURE 13 



may be written in the form

$$F(b) - F(a) = [F(t_n) - F(t_{n-1})] + [F(t_{n-1}) - F(t_{n-2})] + \cdots + [F(t_1) - F(t_0)] = \sum_{k=1}^{n} [F(t_k) - F(t_{k-1})] \qquad (7)$$

Since $f$ is the derivative of F, the Mean Value Theorem guarantees that
within each interval $[t_{k-1}, t_k]$ there is a point $z_k$ at which

$$F(t_k) - F(t_{k-1}) = f(z_k)[t_k - t_{k-1}]$$

Substituting this into (7) gives

$$F(b) - F(a) = \sum_{k=1}^{n} f(z_k)[t_k - t_{k-1}] = \sum_{k=1}^{n} f(z_k)\,\frac{b-a}{n}$$

If, as in Equations (3) and (4), we choose $x_k$ and $y_k$ to be points in the
interval $[t_{k-1}, t_k]$ at which $f$ takes its largest and smallest values, respectively
(Figure 12), then

$$f(y_k) \le f(z_k) \le f(x_k)$$

which means that

$$\sum_{k=1}^{n} f(y_k)\,\frac{b-a}{n} \le \sum_{k=1}^{n} f(z_k)\,\frac{b-a}{n} \le \sum_{k=1}^{n} f(x_k)\,\frac{b-a}{n}$$

or, equivalently,

$$m_n \le F(b) - F(a) \le M_n$$

for all values of n. Since the number $F(b) - F(a)$ is trapped between
corresponding members of the sequences $(M_n)$ and $(m_n)$, and since both
these sequences converge to the value of the integral, it follows that

$$F(b) - F(a) = \int_a^b f(x)\,dx$$

This completes the proof.
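
The squeeze used in this proof can be watched numerically. The following Python sketch is an added illustration, not part of the original argument. The helper name `upper_lower_sums` is hypothetical, and the function $g(x) = 4 - x^2$ on [-2, 2] is borrowed from Example 1 as an assumption for the demonstration. The largest and smallest values on each subinterval are approximated by dense sampling, which is adequate for this parabola.

```python
# Sketch: upper and lower sums trap F(b) - F(a) for g(x) = 4 - x**2 on [-2, 2].

def g(x):
    return 4 - x * x


def G(x):
    # One antiderivative of g; any other differs from it by a constant.
    return 4 * x - x ** 3 / 3


def upper_lower_sums(f, a, b, n, samples=50):
    """Approximate M_n and m_n by sampling f inside each of n equal subintervals."""
    width = (b - a) / n
    upper = 0.0
    lower = 0.0
    for k in range(1, n + 1):
        left = a + (k - 1) * width
        values = [f(left + j * width / samples) for j in range(samples + 1)]
        upper += max(values) * width
        lower += min(values) * width
    return upper, lower


exact = G(2) - G(-2)  # value given by the Fundamental Theorem
for n in (4, 40, 400):
    M_n, m_n = upper_lower_sums(g, -2, 2, n)
    print(n, m_n, exact, M_n)
```

For each n the lower sum should print below $G(2) - G(-2)$ and the upper sum above it, with the gap shrinking as n grows.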


Example 2 Let $F(x) = \tfrac12 x^2 + x$ and $f(x) = x + 1$. Since $F'(x) = f(x)$
it follows from the Fundamental Theorem that

$$\int_0^1 f(x)\,dx = \int_0^1 (x + 1)\,dx = F(1) - F(0) = \left(\tfrac12 + 1\right) - (0) = \tfrac32$$

This represents the area in Figure 1 and verifies our previous calculations ►



x — X 


9. 

IX. 

13. 


i 


f 

Jo 

2. / 

r3 o 

“/ffr 


. sin 2x 
3 i — : r — </y 


1 4- sin 2 x 
12. / sin2xrfx 

dx 


15. Find the area bounded by $f(x) = 3x^2$ and $g(x) = 16 - x^2$


14.5 BASIC PROPERTIES OF INTEGRALS

In order to apply the Fundamental Theorem in the examples of the preceding
section, we tacitly assumed that the functions encountered were integrable.
Since Example 6 of Section 14.2 contains a function which cannot be
integrated, we naturally ask the question "What functions are integrable?"
A partial answer, but an extremely important one, is contained in the
following theorem, which we state without proof.*

Theorem 2 (a) All continuous functions are integrable. Specifically, if the
function $f$ is continuous throughout the interval [a, b], then $\int_a^b f(x)\,dx$
exists.

(b) If the function $f$ is continuous in [a, b] except for a finite number of
points of discontinuity, then $\int_a^b f(x)\,dx$ exists ►

Integrals which are not covered by Theorem 2 together with the
Fundamental Theorem rarely arise in practice. By relying on the Fundamental
Theorem we shall lose little of essence for applications while gaining a great
deal in simplicity. Thus, although our theorems are actually valid in a broader
sense, from now on in this book we shall consider only those cases to which the
Fundamental Theorem, in conjunction with Theorem 2, may be applied. One of
the immediate benefits of this approach is that properties of integrals may be
derived directly from already familiar properties of derivatives.

Theorem 3 (a) If $f(x)$ is integrable and c is any constant, then $cf(x)$ is
integrable and

$$\int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx$$

*The proof of this result is beyond the scope of the present text. A good source is Bartle,
R. G., The Elements of Real Analysis (John Wiley & Sons, Inc., New York, 1964),
pp 23J-2S4. Roughly, the argument rests on the fact that points close together in the
domain of a continuous function generate functional values which are close together.
Hence, by taking n large enough, the quantities $M_n$ and $m_n$ will have values which are
nearly the same.





Example 4 The area lying between the graphs of the parabola $f(x) = x^2$
and the line $g(x) = x + 2$ is shaded in Figure 14. By definition, the required
area is given by the integral



Evaluate the integrals in Problems 1-14 


3 * *(*" — 2)~ t/K 

5 . / - * • dx 
Ju Vat- -f- 3 

7 * / sm^xcostJx 
Jo 


2m O’ 3 + 2y — 6) dy 

*' f' V57TT </ ' 

6 f!~^ 

8 - J o B + 





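Example 1

(a) $$\int_1^2 (t^7 - 3t^2 + 11)\,dt = \int_1^2 t^7\,dt - \int_1^2 3t^2\,dt + \int_1^2 11\,dt$$

$$= \frac{2^8 - 1^8}{8} - (2^3 - 1^3) + 11\cdot 2 - 11\cdot 1 = 32 - \tfrac18 - 7 + 11 = 35\tfrac78$$

(b) $$\int_0^1 (2x + 1)(x^2 - 1)^3\,dx = \int_0^1 2x(x^2 - 1)^3\,dx + \int_0^1 (x^2 - 1)^3\,dx$$

In the first integral on the right, we recognize 2x as the derivative of $x^2 - 1$.
Recalling that an antiderivative of $[f(x)]^n f'(x)$ is $[f(x)]^{n+1}/(n + 1)$, we
have

$$\int_0^1 2x(x^2 - 1)^3\,dx = \frac{(x^2 - 1)^4}{4}\Big]_0^1 = -\frac14$$

In the second integral we simply multiply out to obtain

$$\int_0^1 (x^2 - 1)^3\,dx = \int_0^1 (x^6 - 3x^4 + 3x^2 - 1)\,dx = \frac17 - \frac35 + 1 - 1 = -\frac{16}{35}$$

The original integral is equal to

$$-\frac14 - \frac{16}{35} = -\frac{99}{140}$$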
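
The arithmetic in Example 1 can be checked exactly with rational numbers. The sketch below is an added illustration: the helper `integrate_poly` is hypothetical, and it assumes the limits of integration are [1, 2] in part (a) and [0, 1] in part (b), which is what the surviving intermediate values suggest. It applies the power rule term by term, as permitted by Theorems 3 and 4.

```python
from fractions import Fraction as Fr


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k), using the power rule."""
    def antiderivative(x):
        return sum(Fr(c, k + 1) * Fr(x) ** (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)


# (a) t^7 - 3t^2 + 11 on [1, 2]; coefficients listed from the constant term upward.
part_a = integrate_poly([11, 0, -3, 0, 0, 0, 0, 1], 1, 2)

# (b) (2x + 1)(x^2 - 1)^3 multiplied out:
#     2x^7 + x^6 - 6x^5 - 3x^4 + 6x^3 + 3x^2 - 2x - 1, on [0, 1].
part_b = integrate_poly([-1, -2, 3, 6, -3, -6, 1, 2], 0, 1)

print(part_a, part_b)
```

Working with `Fraction` avoids rounding, so the results can be compared directly with the fractions obtained by hand.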


Theorem 5 If a < c < b and if $f(x)$ is integrable in each of the intervals
[a, b], [a, c], and [c, b], then

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \qquad (8)$$

(In terms of area, if $f(x) \ge 0$ for all x in [a, b], Equation (8) says that the
total area of the region shaded in Figure 15 is equal to the area shaded to the
left of the point c plus the area shaded to the right of c.)

PROOF If F is an antiderivative of $f$, then

$$\int_a^b f(x)\,dx = F(b) - F(a) = [F(c) - F(a)] + [F(b) - F(c)] = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$$ ►




In words, the integral of a constant times a function is equal to the constant
times the integral of the function.

(b) If $f(x)$ and $g(x)$ are both integrable, then $f(x) + g(x)$ is integrable
and

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$

That is, the integral of a sum of two functions is equal to the sum of the
integrals of the individual functions.

PROOF (a) Suppose

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

where F(x) is a function whose derivative is f(x). Then the derivative of cF(x)
is cf(x) and it follows that

$$\int_a^b cf(x)\,dx = cF(b) - cF(a) = c\int_a^b f(x)\,dx$$

(b) Let

$$\int_a^b f(x)\,dx = F(b) - F(a) \quad\text{and}\quad \int_a^b g(x)\,dx = G(b) - G(a)$$

where $F'(x) = f(x)$ and $G'(x) = g(x)$. Then, if $H(x) = F(x) + G(x)$, we
have $H'(x) = f(x) + g(x)$ so that

$$\int_a^b [f(x) + g(x)]\,dx = H(b) - H(a) = F(b) + G(b) - [F(a) + G(a)] = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$ ►

Theorem 3 can be extended by induction to linear combinations of any
finite number of functions. The proof of the next theorem is left as an exercise.

Theorem 4 If $f_1(x), f_2(x), \ldots, f_n(x)$ are n functions integrable on [a, b]
and if $c_1, c_2, \ldots, c_n$ are constants, then

$$\int_a^b [c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x)]\,dx = c_1\int_a^b f_1(x)\,dx + c_2\int_a^b f_2(x)\,dx + \cdots + c_n\int_a^b f_n(x)\,dx$$ ►



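b is

$$y(b) - y(a) = \int_0^b 2xe^{-x^2}\,dx - \int_0^a 2xe^{-x^2}\,dx = \int_a^0 2xe^{-x^2}\,dx + \int_0^b 2xe^{-x^2}\,dx = \int_a^b 2xe^{-x^2}\,dx$$

Since $D(-e^{-x^2}) = 2xe^{-x^2}$, this latter proportion is

$$-e^{-x^2}\Big]_a^b = e^{-a^2} - e^{-b^2}$$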


PROBLEMS 


1. Find the area bounded by the vertical axis and the functions f, g, and h
defined by $[f(x)]^2 = 2x$, $g(x) = 1$, and $h(x) = 4$

2. Prove that if $f(-x) = f(x)$ for all x, then

$$\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$$

3. Show that if $f(a - x) = f(x)$ for all x, then

$$\int_0^a f(x)\,dx = 2\int_0^{a/2} f(x)\,dx$$

4. Prove Theorem 4


14.6 APPLICATIONS 

The definite integral

$$\int_a^b f(x)\,dx \qquad (9)$$

is defined (Section 14.2) as the limit of a sequence of sums of the form

$$\sum_{k=1}^{n} f(x_k)\,\frac{b-a}{n}$$

On the other hand, the Fundamental Theorem (Section 14.4) relates the
integral (9) to an antiderivative F of $f$ by

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

An important point of view for applications of the integral is obtained from
this result if we define a function G by

$$G(x) = \int_a^x f(t)\,dt$$




FIGURE 15 



Thus far we have defined the integral $\int_a^b f(x)\,dx$ only when a < b. It is
convenient to extend this definition to allow any relation between a and b.
We do this by requiring that Theorem 5 (Equation (8)) hold for all possible
choices of a, b, and c. In particular, if a = b = c, we have

$$\int_a^a f(x)\,dx = \int_a^a f(x)\,dx + \int_a^a f(x)\,dx = 2\int_a^a f(x)\,dx$$

so that $\int_a^a f$ must be zero for any choice of a. Choosing a = b and a < c
gives

$$\int_a^a f(x)\,dx = \int_a^c f(x)\,dx + \int_c^a f(x)\,dx$$

Since $\int_a^a f(x)\,dx = 0$, this means that

$$\int_c^a f(x)\,dx = -\int_a^c f(x)\,dx$$

These results are summarized in the following definition.

Definition 1

(a) $\int_a^a f(x)\,dx = 0$ for all a

(b) If $\int_a^b f(x)\,dx$ exists, then

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$ ►

Example 2 Suppose that y(z), the proportion of a population scoring less
than z on a behavioral measurement, is given by

$$y(z) = \int_0^z 2xe^{-x^2}\,dx$$

Then the proportion scoring between a and b is the proportion scoring below
b less the proportion scoring below a. That is, the proportion between a and




capital stock formation in the nth year is


i: 


(2 1 1 '- + 3) c it 


it 312 + 3/]" 


= SK« + O 3 ' 


-]+ 3 


Example 4 Suppose that the number N(t) of animals which remain in a
conditioning box t seconds after the onset of shock is given, for t > 0, by

$$N(t) = N_0 e^{-\lambda t}$$

Here, $\lambda > 0$ is a constant and $N_0$ denotes the number initially in the box.
We wish to find the total number A(T) of animal-seconds spent in the box
by time T.

To approximate A(T), we partition the time interval [0, T] into n smaller
intervals, each of length T/n (see Figure 16). If n is large, the number of
animal-seconds spent in the box in the kth short interval $[(k - 1)T/n, kT/n]$
is given approximately by the product $(T/n)N(kT/n)$ of the number $N(kT/n)$
of animals in the box at time kT/n times the number T/n of seconds which
elapse in the interval. (This product is also the area of the dotted rectangle
in Figure 16.) The total number of animal-seconds for the entire interval
[0, T] is given approximately by the sum

$$\sum_{k=1}^{n} N\!\left(\frac{kT}{n}\right)\frac{T}{n} \qquad (10)$$

obtained by adding up the approximate numbers for the n small intervals.

FIGURE 16

As $n \to \infty$, the sum (10) should converge to the actual number of
animal-seconds spent in the box. But the limit

$$\lim_{n\to\infty} \sum_{k=1}^{n} N\!\left(\frac{kT}{n}\right)\frac{T}{n}$$




G is simply a function of the upper limit of an integral. Thus we have
$G(x) = F(x) - F(a)$ and $G(a) = 0$. Since F is an antiderivative of $f$, we
have $DG(x) = DF(x) = f(x)$ for all x, from which we learn that G is also
an antiderivative of $f$. Put another way, $f$ represents the rate of change of G.
The equivalent integral notation

$$D\int_a^x f(t)\,dt = f(x)$$

produces a still different sounding result: the derivative of an integral
with variable upper limit is just the integrand evaluated at that upper limit.
Thus we have a number of different ways of looking at integrals. As the
following examples show, each way leads to important applications.

Example 1 We know, of course, that the speed at which an automobile
moves represents the rate of change of the distance traveled. It follows that
distance is the integral of speed. Specifically, if v(t) represents the velocity of
an automobile at time t, then the distance covered between times t = a and
t = b is

$$D(a, b) = \int_a^b v(t)\,dt$$

To take a particular case, if $v(t) = t + 1$, the distance covered between
t = 0 and t = 1 is

$$D(0, 1) = \int_0^1 (t + 1)\,dt = \frac{t^2}{2} + t\,\Big]_0^1 = \frac32$$

(Compare with Example 2 of Section 14.4) ►

Example 2 Suppose that the marginal cost function for a certain
commodity is $f(x) = ax^{1/2}$ where x denotes quantity produced and a > 0 is a
constant. Since marginal cost is by definition the derivative of the total cost
function C, the total cost to produce x units is

$$C(x) = C_0 + \int_0^x f(t)\,dt = C_0 + \int_0^x at^{1/2}\,dt$$

where $C_0$ is the cost associated with zero production ►

Example 3 The amount of capital stock C is related to the rate of net
investment I by the equation $DC(t) = I(t)$. Thus, if the rate of net investment
at time t is given (in thousands of dollars per year) by $I(t) = 2t^{1/2} + 3$, the





represents the rate of change of income at time t. That is, if income were to
remain constant for one year at the rate A(t), then A(t) dollars would be
received in that year ►
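
The two limiting ideas in Example 5 (more frequent compounding approaching $e^{rt}$, and the sum (12) corresponding to the integral (13)) can be compared numerically. The sketch below is an added illustration; the function names are hypothetical, r is treated as a decimal rate, and the income stream is assumed constant so that the integral (13) has the closed form $(A/r)(1 - e^{-rT})$.

```python
import math


def compound(A0, r, t, n):
    """Value after t years when interest at rate r is compounded n times per year."""
    return A0 * (1 + r / n) ** (n * t)


def continuous(A0, r, t):
    """Formula (11): value under continuous compounding."""
    return A0 * math.exp(r * t)


def present_value_sum(A, r, T):
    """Sum (12) for a constant payment A received at the end of each of T years."""
    return sum(math.exp(-r * t) * A for t in range(1, T + 1))


def present_value_integral(A, r, T):
    """Integral (13) for a constant income rate A, evaluated in closed form."""
    return A / r * (1 - math.exp(-r * T))


for n in (1, 12, 365, 10 ** 6):
    print(n, compound(100.0, 0.05, 10, n))
print("continuous", continuous(100.0, 0.05, 10))
print(present_value_sum(1000.0, 0.05, 20), present_value_integral(1000.0, 0.05, 20))
```

With a constant income the sum (12) falls slightly below the integral (13), because each year's payment is discounted as if it all arrived at the end of the year.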


Example 6 Net savings resulting from a proposed manufacturing facility
are assumed to be a continuous flow at a constant annual rate of A dollars.
If the market interest rate r is assumed constant, the present value of future
savings to be realized over T years is, using (13),

$$P = A\int_0^T e^{-rt}\,dt = \frac{A}{r}(1 - e^{-rT})$$

Suppose that construction cost C(t) increases with the durability of the
materials used according to the function $C(t) = kt^{1/2}$, where k is a constant
and t denotes durability in years. Then the optimal life T* of the proposed
facility is that value of T which maximizes the ratio of present value of
future savings to cost. That is, T* is the value of T which maximizes

$$R(T) = \frac{A(1 - e^{-rT})}{rkT^{1/2}}$$

Applying the Interior Maximum and Minimum Theorems (Theorems 14
and 15 of Section 12.8) we find that R(T) is maximized when $e^{rT} = 1 + 2rT$.
The optimal life of the new facility is given approximately by

$$T^* = 1.25/r$$ ►
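
The condition $e^{rT} = 1 + 2rT$ in Example 6 has no closed-form solution in elementary functions, so the quoted value has to come from a numerical method. The following sketch is an added illustration that uses plain bisection on $x = rT$; the function names and the bracketing interval [0.5, 2] are assumptions chosen for the demonstration.

```python
import math


def h(x):
    # h(x) = 0 exactly when e^x = 1 + 2x, where x stands for r*T.
    return math.exp(x) - 1 - 2 * x


def bisect(func, lo, hi, tol=1e-10):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    assert func(lo) * func(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# x = 0 also satisfies the equation but does not correspond to a maximum,
# so the search interval is chosen to exclude it.
x_star = bisect(h, 0.5, 2.0)
print(round(x_star, 4))
```

The root lies near 1.256, which is consistent with the approximation $T^* = 1.25/r$.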


PROBLEMS 

1. Verify the result $T^* = 1.25/r$ quoted in Example 6.

2. Suppose that the marginal rate of information transmission between the
input and output of a complex system is given by $A\mu^{-x}$ where A and $\mu$
are constants and x is a measure of system complexity. If F(x) denotes
the amount of information transmitted by a system of complexity x,
and if F(0) = 0, show that

$$F(x) = \frac{A}{\ln\mu}\left[1 - \mu^{-x}\right]$$

3. Assume that total world population N can be described at any time t by

$$N(t) = N_0 e^{\lambda t}$$

where $\lambda$ is a positive constant and $N_0$ denotes population size at an
arbitrary reference point called t = 0. Assume further that Man's history
to date may be characterized by a life expectancy of 25 years. Show that
the total number G(T) of people who have lived on the earth during the
time period [0, T] is approximately $(N_0/25\lambda)(e^{\lambda T} - 1)$. (Hint: Let
T = 25n, where n is an integer. Argue that G(T) is given approximately
by $\sum_{k=0}^{n-1} N_0 e^{25\lambda k}$ and that this sum is in turn approximated by
$\tfrac{1}{25}\int_0^T N_0 e^{\lambda t}\,dt$.)




is the integral

$$\int_0^T N(t)\,dt$$

(that is, the area under N(t) from t = 0 to t = T). Hence

$$A(T) = \int_0^T N_0 e^{-\lambda t}\,dt = -\frac{N_0}{\lambda}e^{-\lambda t}\Big]_0^T = \frac{N_0}{\lambda}\left[1 - e^{-\lambda T}\right]$$

The eventual number of animal-seconds spent in the box is

$$\lim_{T\to\infty} A(T) = \frac{N_0}{\lambda}$$ ►
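
The passage from the sum (10) to the integral in Example 4 can be illustrated numerically. In this added sketch the parameter values for $N_0$, $\lambda$ (written `lam`), and T are hypothetical, chosen only to show the approximating sums approaching the closed form $(N_0/\lambda)(1 - e^{-\lambda T})$.

```python
import math


def animal_seconds_sum(N0, lam, T, n):
    """Sum (10): n rectangles of width T/n, each using N at the right endpoint kT/n."""
    width = T / n
    return sum(N0 * math.exp(-lam * k * width) * width for k in range(1, n + 1))


def animal_seconds_exact(N0, lam, T):
    """Closed form obtained from the Fundamental Theorem."""
    return N0 / lam * (1 - math.exp(-lam * T))


N0, lam, T = 20, 0.1, 30.0  # hypothetical values for illustration only
for n in (10, 100, 1000):
    print(n, animal_seconds_sum(N0, lam, T, n), animal_seconds_exact(N0, lam, T))
```

Because N(t) is decreasing, each right-endpoint rectangle lies under the curve, so every sum underestimates the integral and the shortfall shrinks as n increases.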


Example 5 An amount $A_0$ of money invested at r percent compounded
annually becomes $A(t) = (1 + r)^t A_0$ in t years. If interest is compounded
n times per year, this formula is replaced by

$$A(t) = \left(1 + \frac{r}{n}\right)^{nt} A_0$$

Since $\lim_{n\to\infty} (1 + a/n)^n = e^a$ (see Problem 8 in Section 5.4), we see that
as $n \to \infty$ (that is, if interest is compounded continuously), A(t) becomes

$$A(t) = e^{rt}A_0 \qquad (11)$$

The present value of an amount A(t) to be received t years in the future
is the amount which, invested now, would yield A(t) at time t.
From (11) we see that if interest is compounded continuously at r percent
per year, the present value of an amount A(t) due in t years is

$$A_0 = e^{-rt}A(t)$$

The present value of a stream of incomes A(1), A(2), ..., A(T) due,
respectively, in 1, 2, ..., T years is thus given by

$$P(T) = e^{-r}A(1) + e^{-2r}A(2) + \cdots + e^{-Tr}A(T) = \sum_{t=1}^{T} e^{-rt}A(t) \qquad (12)$$

If we think of having partitioned the interval [0, T] into T smaller intervals
each of unit length, the sum (12) may be interpreted as a sum corresponding
to the integral

$$\int_0^T e^{-rt}A(t)\,dt \qquad (13)$$

This integral may itself be interpreted as the present value of a stream of
income to be received continuously over the time period [0, T], where A(t)



(c) A laborer, in pushing a cart containing a leaking container a distance 
of 80 feet, loses \ of the contents on the way The cart weighs 50 
pounds, the container 20 pounds, and the contents originally weighed 
160 pounds Assume that the force required to push the cart is equal 
to 3 of the total weight and that the amount which has leaked out is 
proportional to the distance traveled Find the work done in moving 
the container 


9. In a model of extinction, Bush and Mosteller* derive the equation

$$\frac{dp}{dt} = -wbp^2$$

where w and b are constants. Given the initial condition $p(0) = p_0$,
integrate to obtain the theoretical extinction curve p(t).

10. As a further elaboration of their model (Problem 9), Bush and Mosteller
show that the response rate is

$$R'(t) = \frac{wp_0}{1 + wbp_0 t}$$

where R(t) denotes the number of responses emitted by time t.

(a) Find R(t)

(b) Find the number of responses emitted during the time interval from
t = 10 to t = 50


11. In a waiting-line modU.t Y, the time spent waiting in line for service, 
is approximated by 

— +!)/[(!/ D-dlSU r 


■/; 


+ , + ' (S-s)]* 


and /F, the service time, is approximated by 


■f. 


-u+n/uw o-aisii 


(!)• 


where P, A, A , and S are constants Find >' + W, the cumulative waiting 
lime for the entire cycle 


14.7 INTEGRATION BY PARTS 

The discussion in this and the following two sections will be concerned with
techniques for finding antiderivatives, and hence for evaluating integrals, of
certain types of functions which often arise in applications of the integral
concept.

*Bush, R. R., and Mosteller, F., "A Mathematical Model for Simple Learning,"
Psychological Review 58, 313-323 (1951)

†Adapted from Fabrycky, W. J., and Torgersen, P. E., Operations Economy: Industrial
Applications of Operations Research (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1966),
Chapter 12





4. Suppose that when an intercity passenger bus has been driven m hundred
thousand miles, average repair costs in dollars per mile are given by

$$R(m) = Am^{1/2}$$

where A is a constant. Find the average total repair cost for a bus which
has been driven 400,000 miles.

5. Upon the successful completion of a research project, each of N staff
members reports, to the nearest day, the total time which he spent on
the project. If the proportion p(t) of the staff who report exactly t days
spent on the project is given by

p(t) = for t = 1, 2, 3, ..., 50

and

p(0) = 1 - [p(1) + p(2) + ... + p(50)]

find

(a) the exact labor cost (in summation form), assuming labor is billed
at 100 dollars per man day

(b) an integral approximation to the exact cost

6. The marginal direct labor function for a certain chemical plant is


where Y is direct man hours, x is quantity produced (100 tons), and
K > 0 and r (0 < r < 1) are constants. Find

(a) the total man hours required to produce quantity $x_0$

(b) the average direct labor per 100 tons required to produce quantity $x_0$

7. Suppose in Problem 6 K = 1000 and r = 0.322. Find the total man-
hours and the average direct labor per 100 tons required to produce

(a) the first 1000 tons

(b) the second 1000 tons

(c) the third 1000 tons

8. If a constant force $f$ is applied through a distance d, the work done is,
by definition,

$$W = fd \qquad (14)$$

If the force changes continuously over an interval from point a to point b,
with the force at $x \in [a, b]$ being $f(x)$, the work done is

$$W = \int_a^b f(x)\,dx \qquad (15)$$

(a) Show that if $f(x)$ is constant over the interval [a, b] then Formula
(15) becomes Formula (14)

(b) By partitioning the interval [a, b] into n smaller intervals, using (14)
to approximate the work done in each small interval and taking
limits as $n \to \infty$, derive Formula (15)





Note that the advantage of the integration-by-parts formula comes in the
replacement of $f(t) = t$ by $f'(t) = 1$, resulting in an integral that is easily
evaluated. ►


PROBLEMS 

1. Verify the final calculations in Example 1. 


Example 2 Let $E(t) = B(t) - G$ denote the deviation of system behavior
B(t) at time t from a goal G. If $E(t) = e^{-at}\sin bt$, where a > 0 and b are
constants, then the cumulative deviation from the goal over the time interval
[0, T] is

$$\int_0^T E(t)\,dt = \int_0^T (e^{-at}\sin bt)\,dt$$

This integral may be evaluated by integrating twice by parts. First, let
$f(t) = \sin bt$ and $g'(t) = e^{-at}$ in Formula (16) to obtain

$$\int_0^T e^{-at}\sin bt\,dt = -\frac1a e^{-at}\sin bt\,\Big]_0^T - \int_0^T \left[-\frac1a e^{-at}\right] b\cos bt\,dt = -\frac1a e^{-aT}\sin bT + \frac{b}{a}\int_0^T e^{-at}\cos bt\,dt \qquad (17)$$

Next, put $f(t) = \cos bt$ and $g'(t) = e^{-at}$ in (16) to obtain

$$\frac{b}{a}\int_0^T e^{-at}\cos bt\,dt = \frac{b}{a}\left\{-\frac1a e^{-at}\cos bt\,\Big]_0^T - \int_0^T \left[-\frac1a e^{-at}\right](-b\sin bt)\,dt\right\} = \frac{b}{a^2}\left(1 - e^{-aT}\cos bT\right) - \frac{b^2}{a^2}\int_0^T e^{-at}\sin bt\,dt \qquad (18)$$

Inserting (18) into (17) and solving gives

$$\int_0^T e^{-at}\sin bt\,dt = \frac{b - e^{-aT}(a\sin bT + b\cos bT)}{a^2 + b^2}$$
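
A result obtained by integrating twice by parts is easy to get wrong by a sign, so a numerical check is worthwhile. The sketch below is an added illustration: it compares the closed form $[b - e^{-aT}(a\sin bT + b\cos bT)]/(a^2 + b^2)$ with a composite Simpson approximation. The helper names and the values of a, b, and T are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def closed_form(a, b, T):
    """Value of the integral of exp(-a*t)*sin(b*t) over [0, T] from Example 2."""
    return (b - math.exp(-a * T) * (a * math.sin(b * T) + b * math.cos(b * T))) / (a * a + b * b)


a, b, T = 0.5, 3.0, 4.0  # illustrative values only
numeric = simpson(lambda t: math.exp(-a * t) * math.sin(b * t), 0.0, T)
print(numeric, closed_form(a, b, T))
```

The two printed numbers should agree to many decimal places; a disagreement would point to an algebra slip in (17) or (18).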


PROBLEMS 

2. Check the calculations in Example 2 


Example 3 Many formulas listed in a table of integrals* express the
integral of a function which involves the nth powers of some expression, in

*Extensive tables of antiderivatives (integrals) are available which include all the cases
we will consider plus many more. Recourse to such a table is a must for many problems
involving integrals. One of the most widely available sources is the C.R.C. Handbook of
Standard Mathematical Tables, published by the Chemical Rubber Publishing Company,
Cleveland, Ohio





The first of these techniques, integration by parts, is merely a restatement
of the formula for differentiating a product. If $f$ and $g$ are two differentiable
functions, we know (Section 12.2) that

$$D[f(x)g(x)] = f(x)g'(x) + g(x)f'(x)$$

In terms of antiderivatives, this is equivalent to saying that

$$\int f(x)g'(x)\,dx + \int g(x)f'(x)\,dx = f(x)g(x) + C$$

or, rewritten in the form that is usually called integration by parts,

$$\int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx \qquad (16)$$

In terms of integrals, this latter formula reads

$$\int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big]_a^b - \int_a^b g(x)f'(x)\,dx$$

An important special case of (16) is

$$\int f(x)\,dx = xf(x) - \int xf'(x)\,dx$$

obtained by taking g to be the identity function $g(x) = x$.


Example 1 Suppose in Example 6 of Section 14.6 that future savings
decline according to the linear function $A(t) = a - bt$ where a and b are
positive constants. Then the present value of savings recouped over T years is

$$P = \int_0^T (a - bt)e^{-rt}\,dt = \int_0^T ae^{-rt}\,dt - b\int_0^T te^{-rt}\,dt$$

The first of these integrals is easily evaluated as

$$\int_0^T ae^{-rt}\,dt = -\frac{a}{r}e^{-rt}\Big]_0^T = \frac{a}{r}(1 - e^{-rT})$$

The second may be integrated by parts by taking $f(t) = t$ and $g'(t) = e^{-rt}$
in Formula (16). We obtain

$$\int_0^T te^{-rt}\,dt = -\frac{t}{r}e^{-rt}\Big]_0^T + \frac1r\int_0^T e^{-rt}\,dt = -\frac{T}{r}e^{-rT} + \frac{1}{r^2}(1 - e^{-rT})$$

For the original integral we have

$$\int_0^T (a - bt)e^{-rt}\,dt = \frac{1}{r^2}\left[ar - b - (ar - brT - b)e^{-rT}\right]$$







8. The following is known as Wallis' formula for $\pi$:*

(a) Use the reduction formula in Problem 7 to show that

$$\int_0^{\pi/2} \sin^n x\,dx = \frac{n-1}{n}\int_0^{\pi/2} \sin^{n-2} x\,dx$$

(b) Let

$$S_n = \int_0^{\pi/2} \sin^n x\,dx$$

Show that $S_0 = \pi/2$ and $S_1 = 1$

(c) Show by induction that

$$S_{2n} = \frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac34\cdot\frac12\cdot\frac{\pi}{2}$$

$$S_{2n+1} = \frac{2n}{2n+1}\cdot\frac{2n-2}{2n-1}\cdots\frac45\cdot\frac23$$

(d) Show by induction that the sequence $(S_n)$ is monotone decreasing
(Hint: $0 \le \sin x \le 1$)

(e) From (d), argue that

$$\frac{2n}{2n+1}S_{2n} \le \frac{2n}{2n+1}S_{2n-1} = S_{2n+1} \le S_{2n}$$

and hence that

$$\lim_{n\to\infty} \frac{S_{2n+1}}{S_{2n}} = 1$$

(f) By using (c) to actually write $S_{2n+1}/S_{2n}$, and then applying (e),
show that

$$\lim_{n\to\infty} \frac{2\cdot2}{1\cdot3}\cdot\frac{4\cdot4}{3\cdot5}\cdot\frac{6\cdot6}{5\cdot7}\cdots\frac{(2n)(2n)}{(2n-1)(2n+1)} = \frac{\pi}{2}$$

(g) Compute 2(S*i/Ss) and compare the result with $\pi = 3.14159$

9. Obtain the reduction formula

$$\int \sin^r x\cos^t x\,dx = \frac{\sin^{r+1} x\cos^{t-1} x}{r + t} + \frac{t - 1}{r + t}\int \sin^r x\cos^{t-2} x\,dx$$

10. Use the result of Problem 9 to show that

$$\int \sin^4 x\cos^5 x\,dx = \sin^5 x\left(\tfrac19\cos^4 x + \tfrac{4}{63}\cos^2 x + \tfrac{8}{315}\right) + C$$

11. Rework Problem 10 by first writing $\cos^4 x = (1 - \sin^2 x)^2$, multiplying
this out, and using the formula

$$\int [f(x)]^n f'(x)\,dx = \frac{[f(x)]^{n+1}}{n+1} + C$$

*After John Wallis (1616-1703), an English mathematician who helped lay the
groundwork for the later development of calculus by Isaac Newton





terms of the integral of a function involving (n - 1)st or lower powers of
that expression. These recursive formulas are called reduction formulas.
They are usually obtained by integration by parts.
A typical illustration is the formula

$$\int \cos^n x\,dx = \frac{\sin x\cos^{n-1} x}{n} + \frac{n-1}{n}\int \cos^{n-2} x\,dx$$

valid for $n \ge 2$. It may be derived by taking $f(x) = \cos^{n-1} x$ and
$g'(x) = \cos x$ in (16) to obtain first

$$\int \cos^n x\,dx = \cos^{n-1} x\sin x + (n - 1)\int \sin x\cos^{n-2} x\sin x\,dx$$

Writing $\sin^2 x = 1 - \cos^2 x$ in the integral on the right side gives

$$\int \cos^n x\,dx = \cos^{n-1} x\sin x + (n - 1)\int \cos^{n-2} x\,dx - (n - 1)\int \cos^n x\,dx$$

Solving for $\int \cos^n x\,dx$ yields the stated formula.
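
A reduction formula lends itself naturally to a recursive computation. The sketch below is an added illustration under one extra assumption: it treats the definite integral over $[0, \pi/2]$, where the boundary term $\sin x\cos^{n-1} x/n$ vanishes at both limits for $n \ge 2$, so the formula collapses to $C_n = \frac{n-1}{n}C_{n-2}$ with $C_0 = \pi/2$ and $C_1 = 1$. The function names are hypothetical, and a midpoint rule is used only as an independent check.

```python
import math


def cos_power_integral(n):
    """Integral of cos(x)**n over [0, pi/2], computed with the reduction formula."""
    if n == 0:
        return math.pi / 2
    if n == 1:
        return 1.0
    return (n - 1) / n * cos_power_integral(n - 2)


def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation, used here only for comparison."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


for n in range(2, 7):
    approx = midpoint(lambda x, n=n: math.cos(x) ** n, 0.0, math.pi / 2)
    print(n, cos_power_integral(n), approx)
```

Each recursive call lowers the power by two, mirroring the way the formula is applied by hand until the power 0 or 1 is reached.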


PROBLEMS 


3. Find the following antiderivatives 


(a) J 

f ^ In x dx 

(b) 

J y n In y dy (c) J sin 1 xdx 

(d) j 

f x e~ z dx 

{Hint 

Integrate by parts twice ) 

(e) _ 

1 ^ S ‘ n x i/X 

(■ H "« Ut/Ct) _ I.n (16) / 


4. Evaluate the integrals 


(a) I 2x c dx (u\ f 2 , 

Jo to ) x sin x dx 

Jo 

5. In Example 2, find the cumulative deviation from G in the interval [0, T]
if $E(t) = e^{-at}\cos bt$


6 

7. 


Modify Example 1 by patting _ ,„y F] „ d the value 

ot the savings generated over T>ears 

Use the method of Example 3 to obtain the reduction formula 







2. How should the f and g functions be chosen in (19) to obtain Line 7 of Table 1?


Example 2 To find

$$\int \frac{x^3}{1 + x^4}\,dx$$

let $g(x) = 1 + x^4$ and $f(y) = \ln y$. Then $g'(x) = 4x^3$, $f'(y) = 1/y$, and $f'(g(x)) = 1/(1 + x^4)$. Hence

$$\int \frac{x^3}{1 + x^4}\,dx = \tfrac14\int f'(g(x))g'(x)\,dx = \tfrac14 f(g(x)) + C = \tfrac14\ln(1 + x^4) + C$$ ►


At the risk of introducing some rather questionable notation, we can provide a mnemonic aid to the use of the substitution method. First, recall that

$$\int f'(u)\,du = f(u) + C \qquad (20)$$

If we use the notation $u = g(x)$, so that $du/dx = g'(x)$, and then treat the derivative $du/dx$ as though it were a ratio of two quantities $du$ and $dx$, we obtain $du = g'(x)\,dx$. A glance at (19) shows that with the substitution of u for g(x) we can write

$$\int f'(g(x))g'(x)\,dx = \int f'(u)\,du$$

Using (20) this becomes $f(u) + C$ and, on replacing u by g(x), yields the correct answer $f(g(x)) + C$.

Although the procedure of treating $du/dx$ as a ratio is questionable from a rigorous point of view, it does work and eliminates the need for thinking through the Chain Rule each time. We shall use it in all our future discussions.


Example 3 To compute the antiderivative

$$\int \frac{x^2}{\sqrt{x - 1}}\,dx$$

we substitute $u = (x - 1)^{1/2}$ so that $x = u^2 + 1$. Then $dx = 2u\,du$ and $x^2 = u^4 + 2u^2 + 1$. This gives

$$\int \frac{x^2}{\sqrt{x - 1}}\,dx = \int \frac{u^4 + 2u^2 + 1}{u}\,2u\,du = 2\int (u^4 + 2u^2 + 1)\,du = \tfrac25 u^5 + \tfrac43 u^3 + 2u + C$$

Replacing u by $(x - 1)^{1/2}$ gives

$$\int \frac{x^2}{\sqrt{x - 1}}\,dx = \tfrac25 (x - 1)^{5/2} + \tfrac43 (x - 1)^{3/2} + 2(x - 1)^{1/2} + C$$ ►





to obtain

$$\int \sin^4 x \cos^5 x\,dx = \tfrac15\sin^5 x - \tfrac27\sin^7 x + \tfrac19\sin^9 x + C$$

Which method do you prefer?


14.8 INTEGRATION BY SUBSTITUTION

A special technique of integration called integration by substitution is derived by inverting the Chain Rule (Section 12.4)

$$D[f \circ g(x)] = f'(g(x))g'(x)$$

to obtain the equivalent form

$$\int f'(g(x))g'(x)\,dx = f(g(x)) + C \qquad (19)$$

in terms of antiderivatives. A few examples will help to show how the f and g functions should be chosen in the substitution method.

Example 1 To evaluate the integral

$$\int_0^1 x(x^2 + 1)^2\,dx$$

we substitute $g(x) = x^2 + 1$ and $f(y) = \tfrac13 y^3$. Then $g'(x) = 2x$, $f'(y) = y^2$, and $(x^2 + 1)^2 = f'(g(x))$. Applying Formula (19) gives

$$\int_0^1 x(x^2 + 1)^2\,dx = \tfrac12\int_0^1 2x(x^2 + 1)^2\,dx = \tfrac12\int_0^1 f'(g(x))g'(x)\,dx = \tfrac12 f(g(x))\Big]_0^1 = \tfrac16 (x^2 + 1)^3\Big]_0^1 = \tfrac16[2^3 - 1^3] = \tfrac76$$

This value may be verified by writing $(x^2 + 1)^2 = x^4 + 2x^2 + 1$ to obtain

$$\int_0^1 x(x^2 + 1)^2\,dx = \int_0^1 (x^5 + 2x^3 + x)\,dx = \left[\tfrac16 x^6 + \tfrac12 x^4 + \tfrac12 x^2\right]_0^1 = \tfrac16 + \tfrac12 + \tfrac12 = \tfrac76$$ ►
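The value found in Example 1 can also be confirmed numerically. The following Python sketch is only an illustration under stated assumptions: the helper `midpoint_integral` and the variable names are hypothetical and do not come from the text. It evaluates the integral once by direct quadrature and once through Formula (19) with the f and g chosen in the example.

```python
def midpoint_integral(func, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(y):
    return y ** 3 / 3      # chosen so that f'(y) = y**2

def g(x):
    return x * x + 1       # chosen so that g'(x) = 2x

# Direct numerical integration of x(x^2 + 1)^2 over [0, 1].
direct = midpoint_integral(lambda x: x * (x * x + 1) ** 2, 0.0, 1.0)

# Formula (19): half of f(g(x)) evaluated between the limits 0 and 1.
by_substitution = 0.5 * (f(g(1.0)) - f(g(0.0)))

print(direct, by_substitution)
```

Both numbers should be close to 7/6, the value obtained by either method in the example.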


PROBLEMS

1. The method used in Example 1 is actually just another form of Line 5 in Table 1 in Section 14.3. How should the f and g functions be chosen in (19) to obtain the formula in Line 5?





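Example 4 To find

$$\int \frac{x}{1 + x^4}\,dx$$

let $u = x^2$ and $du = 2x\,dx$. Then

$$\int \frac{x}{1 + x^4}\,dx = \tfrac12\int \frac{1}{1 + u^2}\,du$$

From a table of integrals* we read that

$$\int \frac{1}{1 + u^2}\,du = \tan^{-1} u + C$$

and, on substituting $x^2$ for u,

$$\int \frac{x}{1 + x^4}\,dx = \tfrac12\tan^{-1} x^2 + C$$

This example should be compared with Example 2. ►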

Example 5 If in Example 4 we make the substitution $u = x^4$, then $du = 4x^3\,dx$ and $x\,dx = du/4x^2 = du/4\sqrt{u}$. The integral becomes

$$\int \frac{x}{1 + x^4}\,dx = \int \frac{1}{4\sqrt{u}\,(1 + u)}\,du$$

and our substitution has created a problem more complicated than the original one. Skill in finding the proper substitution comes with practice. ►


Example 6 Hecht† assumes that the photochemical process by which a retinal substance S decomposes when exposed to light of intensity I is characterized by the equation

$$\frac{dy}{dt} = bI[a - y(t)] - k\,y^2(t) \qquad (21)$$

Here a denotes the concentration of S prior to stimulation, b and k are positive constants, and y(t) is the amount of substance S remaining at time t. Rewriting (21) as

$$\frac{y'(t)}{-k[y^2 + (bI/k)y - bIa/k]} = 1$$

and completing the square in the denominator gives

$$\int \frac{y'(t)}{-k\{[y + bI/2k]^2 - bIa/k - (bI/2k)^2\}}\,dt = \int dt = t + C_1$$

*See footnote on page 423.

†Hecht, S., "Vision II. The Nature of the Photoreceptor Process," in A Handbook of General Experimental Psychology, C. Murchison, Ed. (Clark University Press, Worcester, Mass., 1934).





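This may be simplified by substituting $u(t) = y(t) + bI/2k$, so that $u'(t) = y'(t)$, and defining a constant B by $B^2 = bIa/k + (bI/2k)^2$. In the resulting form

$$-\frac{1}{k}\int \frac{u'(t)}{u^2 - B^2}\,dt$$

we use Figure 17 to make a trigonometric substitution* of the form $\cos\theta = B/u$ and $\tan\theta = \sqrt{u^2 - B^2}/B$. Then $u = B\sec\theta$ and $du/d\theta = B\sec\theta\tan\theta$. Hence $u'(t)\,dt = B\sec\theta\tan\theta\,d\theta$ and the integral becomes

$$-\frac{1}{k}\int \frac{B\sec\theta\tan\theta}{B^2\tan^2\theta}\,d\theta = -\frac{1}{kB}\int \csc\theta\,d\theta$$

From a table of integrals, we read

$$\int \csc\theta\,d\theta = -\ln|\csc\theta + \cot\theta| + C_2$$

Replacing $\csc\theta$ by $u/\sqrt{u^2 - B^2}$, $\cot\theta$ by $B/\sqrt{u^2 - B^2}$, u by $y + bI/2k$, and solving for y gives the solution

$$y(t) = -\frac{bI}{2k} + B\,\frac{1 + Ae^{-2kBt}}{1 - Ae^{-2kBt}}$$

where A is a constant of integration to be determined by the initial conditions and B is as defined above. ►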



PROBLEMS

3. Complete the details of Example 6.

4. Rework Example 6 by reading

$$\int \frac{1}{u^2 - B^2}\,du$$

from a table of integrals. Which approach do you prefer?

*Actually, the integral

$$\int \frac{u'(t)}{u^2 - B^2}\,dt = \frac{1}{2B}\ln\left|\frac{u - B}{u + B}\right|$$

may be read directly from an integral table. The remaining details have been included to illustrate the method of trigonometric substitution.




Using the Fundamental Theorem (Section 14.4), Equation (19) (the integration-by-substitution method) may be rewritten in terms of integrals as

$$\int_a^b f'(g(x))g'(x)\,dx = f(g(x))\Big]_a^b = f(g(b)) - f(g(a)) \qquad (22)$$

We also know that

$$\int_{g(a)}^{g(b)} f'(u)\,du = f(u)\Big]_{g(a)}^{g(b)} = f(g(b)) - f(g(a)) \qquad (23)$$

Since the right sides of (22) and (23) are identical, the left sides must be equal. That is,

$$\int_{g(a)}^{g(b)} f'(u)\,du = \int_a^b f'(g(x))g'(x)\,dx$$

In this equation, only the derivative $f'$ appears, rather than f itself. Writing $h = f'$ to simplify the notation, we obtain the formula

$$\int_{g(a)}^{g(b)} h(u)\,du = \int_a^b h(g(x))g'(x)\,dx \qquad (24)$$

If the function g is strictly monotone (increasing or decreasing) on the interval [a, b], then g has an inverse $g^{-1}$. Writing $c = g(a)$ and $d = g(b)$, Equation (24) becomes

$$\int_c^d h(u)\,du = \int_{g^{-1}(c)}^{g^{-1}(d)} h(g(x))g'(x)\,dx \qquad (25)$$

the form most often used in integrating by substitution.

Again, the device of writing $du = g'(x)\,dx$ when $u = g(x)$ simplifies the computational problem of actually making a substitution, or change of variable, in evaluating an integral. Thus, working from the left side of (25), we "reason" as follows. Suppose we make the substitution $u = g(x)$. Then $du$ should be replaced by $g'(x)\,dx$ and $h(u)$ by $h(g(x))$. Moreover, the limits on the integral should be changed to reflect the fact that the argument of the functions is now x rather than u. When $u = g(x) = c$, we have $x = g^{-1}(c)$, and when $u = g(x) = d$, we have $x = g^{-1}(d)$. Thus the substitution $u = g(x)$ leads to

$$\int_c^d h(u)\,du = \int_{g^{-1}(c)}^{g^{-1}(d)} h(g(x))g'(x)\,dx$$

Again, we emphasize that this is a convenient line of reasoning and it works, but it is not strictly rigorous.
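The change-of-variable rule, including the change of limits, can be illustrated numerically. In the Python sketch below every concrete choice is an assumption made for the illustration and does not come from the text: the functions h(u) = u² and g(x) = x² + 1, the limits c = 1 and d = 2, and the helper name `midpoint_integral`. On the interval [0, 1] this g is strictly increasing, so its inverse exists there.

```python
def midpoint_integral(func, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / steps
    return step * sum(func(a + (i + 0.5) * step) for i in range(steps))

def h(u):
    return u * u

def g(x):
    return x * x + 1

def g_prime(x):
    return 2 * x

def g_inverse(u):
    return (u - 1) ** 0.5   # inverse of g on x >= 0

c, d = 1.0, 2.0

# Left side: integral of h(u) du from c to d.
left = midpoint_integral(h, c, d)

# Right side: integral of h(g(x)) g'(x) dx from g_inverse(c) to g_inverse(d).
right = midpoint_integral(lambda x: h(g(x)) * g_prime(x),
                          g_inverse(c), g_inverse(d))

print(left, right)
```

Both sides should come out close to 7/3, the exact value of the integral of u² from 1 to 2.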

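Example 7 To find the value of

$$\int_0^1 \frac{1}{\sqrt{9 - x^2}}\,dx$$

make the trigonometric substitution $\tfrac13 x = \sin\theta$ or $x = 3\sin\theta$ (see Figure 18). Then $\sqrt{9 - x^2} = 3\cos\theta$, $dx/d\theta = 3\cos\theta$, and the integral becomes

$$\int_0^1 \frac{1}{\sqrt{9 - x^2}}\,dx = \int_0^{\sin^{-1}(1/3)} \frac{3\cos\theta}{3\cos\theta}\,d\theta = \int_0^{\sin^{-1}(1/3)} d\theta = \sin^{-1}\left(\tfrac13\right)$$

The new limits 0 and $\sin^{-1}(\tfrac13)$, in terms of $\theta$, come from solving the equation $x = 3\sin\theta$ for $\theta$ when $x = 0$ and when $x = 1$. ►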


FIGURE 18 



Example 8 To find

r'- 

f -j A~dx 

J 312 >/2x + 1 

substitute $u = \sqrt{2x + 1}$ so that $x = \tfrac12(u^2 - 1)$ and $dx = u\,du$. Then
Ja/2 \/Zx +1 ‘ “ u 


- du 


= if* <«■’ — 


1 ) du -•» in 3 - 5 


■I 


- 3 - It - H 

is 


Note that in evaluating the integral there is no need to replace the new variable u by the original variable x. ►


PftQBLCMS 

In ftoMviiu 5-2’, tovJ the moated mdxt n.tc u.tc k taU CVn.uft fat «c I *.<• 
a n^rc c\unmc taUc of irtc *xaU onl> if abwJuUl* » 

5 . I - ' Jt Vt-i S-N: tutc a - % r r : ) 

- \ t - : 


14* lUTtCitMlOM BY 5U a 5 TIT U I IO H / 43) 



a, j (y + 2)^3j • + 2 ily (.Hun Substitute u - v / 3}’ + 2.) 

7 f *1+2 - J, 8 . [yVT- 

J (fi + »)>'* J 


■ 2yil> 


(,2 + ,> i/2 

i. [ 3 sm 1 cos ' Ji (Hmt Substitute u = 2 + sin = 1 ) 
J 2 + sin 2 / 

, [ 1 ± 201 * 11. f -+ 

J COS 2 / J A- + 

f 3 sin 2/ 

J 2 + sin 2 / 


+ 25 

dt {Hmt sin 2/ = 2 sin / cos / ) 


13. J - 

A. j C 


t 


V2 - 9i/ 2 

14. f (2 + tan m ) 2 i/m 


</e (//inf Use a trigonometric substitution ) 

15. / ' dx 

J V2 - 4t 2 

16- I — - t/v (Win/ Use a trigonometric substitution ) 

J yVy2 _ 9 

,S. / 

</>’ (Win/ First complete the square ) 

2 , 7(TT*o^ 

22> J/'(* + °) dx {Hmt Substitute y - x + a ) 

Evaluate the definite integrals in Problems 23-28 


17 7 7 v£ 

19./ 


+ 9 


V8 + 2y - y2 
ds 


10 ‘ / (9 - ,2)2 0 


,-W2 

23 2 + 

J,n sin 2 * 

2 A. f d, 

Jo 1 + sin / 

/•W2 

25. ( 2smfl dS 

Jo cos 2 0 + 9 

r W4 

/ sin x 

26 - / 5— </* 

J 0 cos 2 X 

r , 

H 2 t 

Jo (y 2 + 25+/2 dy 

Jl (* + DC* 2 - l) 1 ' 2 d 

Use a table of integrals to find the following antiderivatives and definite integrals.

29. j sin 4 x dx 

30. J sin 3 x cos 4 x dx 


432 / INTEGRAL CALCULUS 



[wtl 

i 

r 

' sin 2) cos 4 y dy 

31. 1 sin 4 0 cos 4 Odd 

32. J 

Jo 


r 

33. J cos 0 sin 30 dd 

34. j 

f sec 4 0 dd 

35. j tan 3 it du {Hint 

Use the identity tan u - sec" u 1 ) 


36. Verify the following formulas.

(a) $\int \frac{1}{\sqrt{u^2 + a^2}}\,du = \ln\left|u + \sqrt{u^2 + a^2}\right| + C$

(b) $\int \frac{1}{\sqrt{u^2 - a^2}}\,du = \ln\left|u + \sqrt{u^2 - a^2}\right| + C$



14.9 RATIONAL FUNCTIONS AND PARTIAL FRACTION METHODS

A rational function is a quotient of two polynomials. Thus

T and 

are both rational functions. Methods for integrating rational functions are of interest because such functions often arise in practice. In our general discussion of such functions we shall always assume that the degree of the numerator is less than the degree of the denominator. If this is not true, we can make it so by long division. As an example, upon division the rational function $(x^3 - 3x + 2)/(x^2 - 1)$ becomes

$$\frac{x^3 - 3x + 2}{x^2 - 1} = x - \frac{2x - 2}{x^2 - 1}$$

the sum of a polynomial and a rational function of the desired type.
The simplest rational functions have the form

$$\frac{k}{ax + b} \qquad (26)$$

where a, b, and k are constants. As is easily verified by differentiation, the antiderivative of such a function is

$$\int \frac{k}{ax + b}\,dx = \frac{k}{a}\ln|ax + b| + C \qquad (27)$$

The trick in evaluating integrals of rational functions is to write the function as a sum of simple functions like (26). A few examples will indicate the method.





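Example 1 (a) The function $1/(2x^2 + 3x - 2)$ may be rewritten

$$\frac{1}{2x^2 + 3x - 2} = \frac{1}{(2x - 1)(x + 2)} = \frac25\cdot\frac{1}{2x - 1} - \frac15\cdot\frac{1}{x + 2}$$

Hence, using (27),

$$\int \frac{1}{2x^2 + 3x - 2}\,dx = \frac25\int \frac{1}{2x - 1}\,dx - \frac15\int \frac{1}{x + 2}\,dx = \tfrac15\ln|2x - 1| - \tfrac15\ln|x + 2| + C$$

(b) Since

$$\frac{1}{x^3 + 2x^2 - x - 2} = \frac{1}{(x - 1)(x + 1)(x + 2)} = \frac16\cdot\frac{1}{x - 1} - \frac12\cdot\frac{1}{x + 1} + \frac13\cdot\frac{1}{x + 2}$$

it follows that

$$\int \frac{1}{x^3 + 2x^2 - x - 2}\,dx = \tfrac16\ln|x - 1| - \tfrac12\ln|x + 1| + \tfrac13\ln|x + 2| + C$$

(c) Rewriting

$$\frac{2x - 1}{2x^3 - x^2 - 4x + 3} = \frac{2x - 1}{(x - 1)^2(2x + 3)} = \frac{8}{25}\cdot\frac{1}{x - 1} + \frac15\cdot\frac{1}{(x - 1)^2} - \frac{16}{25}\cdot\frac{1}{2x + 3}$$

we obtain

$$\int \frac{2x - 1}{2x^3 - x^2 - 4x + 3}\,dx = \tfrac{8}{25}\ln|x - 1| - \tfrac15\cdot\frac{1}{x - 1} - \tfrac{8}{25}\ln|2x + 3| + C$$ ►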

In Example 1 the factors of the denominator of the original rational function are directly related to the simpler functions on which the integration is performed. Although we will not prove it here, this is always the case. The following summary indicates the general procedure which may be followed.


Rule to Simplify Rational Functions To find

$$\int \frac{P(x)}{Q(x)}\,dx$$

where P and Q are polynomials with the degree of P less than that of Q,

(1) Factor Q. (This may be made easier by recalling that $x - a$ is a factor of $Q(x)$ if and only if $Q(x) = 0$ when $x = a$.)

(2) If $ax + b$ appears exactly n times as a factor of $Q(x)$, then form the sum

$$\frac{d_1}{ax + b} + \frac{d_2}{(ax + b)^2} + \cdots + \frac{d_n}{(ax + b)^n}$$

where $d_1, d_2, \ldots, d_n$ are constants to be determined.

(3) Solve for the $d_j$.

(4) Find the required antiderivatives.
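Step (3) of the rule can be carried out with exact rational arithmetic. The Python sketch below is an illustration only: it treats the rational function of Example 1(c), the names `numerator` and `cleared_form` are hypothetical, and the device of substituting convenient values of x is one possible way to solve for the constants, assumed here for simplicity.

```python
from fractions import Fraction

def numerator(x):
    """P(x) = 2x - 1, the numerator in Example 1(c)."""
    return 2 * x - 1

# After multiplying through by (x - 1)^2 (2x + 3):
#   2x - 1 = d1 (x - 1)(2x + 3) + d2 (2x + 3) + d3 (x - 1)^2

# x = 1 removes the d1 and d3 terms.
x = Fraction(1)
d2 = numerator(x) / (2 * x + 3)

# x = -3/2 removes the d1 and d2 terms.
x = Fraction(-3, 2)
d3 = numerator(x) / (x - 1) ** 2

# x = 0 gives -1 = -3 d1 + 3 d2 + d3, which is solved for d1.
d1 = (numerator(Fraction(0)) - 3 * d2 - d3) / Fraction(-3)

def cleared_form(x):
    """Right side of the identity with the constants inserted."""
    return d1 * (x - 1) * (2 * x + 3) + d2 * (2 * x + 3) + d3 * (x - 1) ** 2

# The identity should now hold for every x, not only the three values used.
identity_holds = all(cleared_form(Fraction(t)) == numerator(Fraction(t))
                     for t in range(-5, 6))
print(d1, d2, d3, identity_holds)
```

Using `Fraction` rather than floating point keeps the constants exact, so they can be compared directly with the coefficients in Example 1(c).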


Example 2 As a typical illustration, let us look again at Example 1(c). There $Q(x) = 2x^3 - x^2 - 4x + 3$ has factors $(x - 1)$, $(x - 1)$, and $(2x + 3)$. Thus we write

$$\frac{P(x)}{Q(x)} = \frac{2x - 1}{2x^3 - x^2 - 4x + 3} = \frac{2x - 1}{(x - 1)^2(2x + 3)} = \frac{d_1}{x - 1} + \frac{d_2}{(x - 1)^2} + \frac{d_3}{2x + 3}$$

where the $d_j$ are to be determined. (Note the inclusion of factors with both $x - 1$ and $(x - 1)^2$ in the denominator.) Multiplying through by $(x - 1)^2(2x + 3)$ gives

$$2x - 1 = d_1(x - 1)(2x + 3) + d_2(2x + 3) + d_3(x - 1)^2$$

Since this expression is to hold for all values of x, we set first $x = 1$ to obtain $2\cdot 1 - 1 = d_2(2\cdot 1 + 3)$ or $d_2 = \tfrac15$. Then, setting $x = -\tfrac32$, we find $2(-\tfrac32) - 1 = d_3(-\tfrac32 - 1)^2$ or $d_3 = -\tfrac{16}{25}$. To find $d_1$, we now set $x = 0$ on both sides to obtain

$$-1 = -3d_1 + 3d_2 + d_3$$

Inserting the known values $d_2 = \tfrac15$ and $d_3 = -\tfrac{16}{25}$ gives $d_1 = \tfrac{8}{25}$ as indicated in Example 1. ►


Example 3 Dodd* assumes that rumors or messages spread through a 
population at a rate proportional to the frequency of contact between those 

*Dodd, S. C., "Diffusion is Predictable: Testing Probability Models for Laws of Interaction," American Sociological Review 20, 392-401 (1955).





PROBLEMS 


Find the following antiderivatives and definite integrals 


1. 

3. 

S. 

7. 

9. 


(-, 
I 
i , ; 
/, 
/. 


- 25 


r/x 


r t/x 


x2 - 16 

/ 2 -f 2/ -f 4 
3 2/3 -F 12/2 - 32/ 4 

3n» 


- r/lV 


2 (1 ~ H * 2 ) 2 

2x 2 + 4x + 1 
o (* + 4)(-t2 + l) 


dx 


2 . f 4 - 

J X- — X — 

*-/^* 

J i ^4-r 


r^X 


X* _ 6x2 -f 9 X 


dx 


(Hint: Use the form $A/(x + 4) + (Bx + C)/(x^2 + 1)$ where A, B, and C are constants. Solve for A, B, and C.)

10. Carry out the solution for y(t) indicated in Example 3.

11. In Example 3, modify the model so that

$$\frac{dy}{dt} = ky(1 - y - a)$$

where a is the fraction of the population who never hear the message. Find the time path y(t).

12. Can the antiderivative

$$\int \frac{1}{x^2 + x + 4}\,dx$$

be found by partial fraction methods? If so, do it. If not, use an alternative procedure.

13. A growth model posits that the rate of growth of a certain population is governed by

$$\frac{dx}{dt} = \gamma(x - \alpha)(x - \beta)$$

where $\alpha$ and $\beta$ denote the initial sizes of two interacting subpopulations, $\gamma$ is an activity parameter, and x(t) is the total population size at time t. Find the time path for x(t).

14. Graham and Gagne* assume that the rate of change of response strength 
R varies with time / according to the equation 

™ = «(i, +k. -kM 

where A i, A_», and A i are constants 


*Graham, C. H., and Gagné, R. M., "The Acquisition, Extinction, and Spontaneous Recovery of a Conditioned Operant Response," Journal of Experimental Psychology 26, 251-280 (1940).





who have received the message and those who have not. He proposes the diffusion, or contagion, model

$$\frac{dy}{dt} = ky(1 - y) \qquad (28)$$

where $y(t)$ and $1 - y(t)$ represent the respective proportions of the population who have and have not heard the message by time t, and k is an activity coefficient.

The time path y(t) is obtained by solving Equation (28). Rewriting this in the form

$$\frac{y'(t)}{y(1 - y)} = k$$

and using the partial fraction expansion

$$\frac{1}{y(1 - y)} = \frac{1}{y} + \frac{1}{1 - y}$$

we find

$$\int \frac{y'(t)}{y(t)}\,dt + \int \frac{y'(t)}{1 - y(t)}\,dt = \int k\,dt$$

or

$$\ln y(t) - \ln[1 - y(t)] = kt + C$$

If $y = y_0$ when $t = 0$, then $C = \ln y_0 - \ln(1 - y_0)$. Solving for y(t), we find the time path

$$y(t) = \frac{y_0 e^{kt}}{1 - y_0(1 - e^{kt})}$$

which is called the logistic curve (Figure 19).

As $t \to \infty$, $y(t) \to 1$ so that eventually the message will diffuse throughout the population. ►
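The logistic time path can be compared with a direct step-by-step solution of the diffusion model. The Python sketch below is an illustration under stated assumptions and is not taken from the text: the starting proportion 0.05, the activity coefficient 0.8, the time horizon, and the function names `logistic` and `euler_path` are all hypothetical choices. The step-by-step solution uses the simple Euler method, so small discrepancies are expected.

```python
import math

def logistic(t, y0, k):
    """Closed-form time path y(t) = y0 e^{kt} / (1 - y0 (1 - e^{kt}))."""
    growth = math.exp(k * t)
    return y0 * growth / (1 - y0 * (1 - growth))

def euler_path(y0, k, t_end, steps=100000):
    """Approximate y(t_end) by stepping dy/dt = k y (1 - y) forward in time."""
    y = y0
    dt = t_end / steps
    for _ in range(steps):
        y += dt * k * y * (1 - y)
    return y

y0, k = 0.05, 0.8   # assumed values for illustration
closed_form = logistic(10.0, y0, k)
stepped = euler_path(y0, k, 10.0)
print(closed_form, stepped, logistic(100.0, y0, k))
```

The first two numbers should nearly agree, and the third should be very close to 1, in line with the remark that the message eventually reaches the whole population.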


FIGURE 19 Logistic Curve 






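(b) The region shaded in Figure 20 has total area equal to $\tfrac12$ since

$$\int_0^\infty xe^{-x^2}\,dx = \lim_{b\to\infty} -\tfrac12\int_0^b D(e^{-x^2})\,dx = \lim_{b\to\infty}\left[-\tfrac12 e^{-x^2}\right]_0^b = \lim_{b\to\infty}\left(-\tfrac12 e^{-b^2} + \tfrac12\right) = \tfrac12$$

FIGURE 20

Although the shaded region is infinite in width, the height $xe^{-x^2}$ decreases so rapidly as x increases that the total area is finite.

(c)

$$\int_0^\infty \sin x\,dx = \lim_{b\to\infty}\Big[-\cos x\Big]_0^b = \lim_{b\to\infty}(1 - \cos b)$$

But $\lim_{b\to\infty}\cos b$ does not exist so the symbol $\int_0^\infty \sin x\,dx$ is meaningless. ►

Example 2 For the improper integral

$$\int_{-\infty}^{\infty} \frac{1}{x^2 + 2x + 5}\,dx = \int_{-\infty}^{0} \frac{1}{x^2 + 2x + 5}\,dx + \int_0^{\infty} \frac{1}{x^2 + 2x + 5}\,dx$$

we find

$$\int_a^0 \frac{1}{(x + 1)^2 + 2^2}\,dx = \tfrac12\tan^{-1}\left(\frac{x + 1}{2}\right)\Big]_a^0 = \tfrac12\left[\tan^{-1}\tfrac12 - \tan^{-1}\frac{a + 1}{2}\right]$$

and

$$\int_0^b \frac{1}{(x + 1)^2 + 2^2}\,dx = \tfrac12\left[\tan^{-1}\frac{b + 1}{2} - \tan^{-1}\tfrac12\right]$$

Since

$$\lim_{a\to-\infty}\tan^{-1}\frac{a + 1}{2} = -\frac{\pi}{2} \quad\text{and}\quad \lim_{b\to\infty}\tan^{-1}\frac{b + 1}{2} = \frac{\pi}{2}$$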




(a) Integrate to obtain the function R(t). (Let $R_0$ be the response strength at $t = 0$.)

(b) Find $\lim_{t\to\infty} R(t)$.


14.10 IMPROPER INTEGRALS

In some applications, particularly in probability and statistics, integrals arise whose limits include $\infty$ or $-\infty$. Examples are

/, inxdx 

f — dx [ xe~ x2 dx 

Jo x J-« 

Such integrals are called improper. However, this should be considered as merely a name attached to such objects and not as a comment on their logical validity. Mathematically, improper integrals are defined as limits of ordinary integrals, as follows.

Definition of Improper Integrals Let a be a fixed real number. Then by the symbol

$$\int_a^\infty f(x)\,dx$$

we mean

$$\lim_{b\to\infty}\int_a^b f(x)\,dx$$

provided this limit exists. Similarly,

$$\int_{-\infty}^a f(x)\,dx = \lim_{b\to-\infty}\int_b^a f(x)\,dx$$

$$\int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^\infty f(x)\,dx = \lim_{b\to-\infty}\int_b^0 f(x)\,dx + \lim_{c\to\infty}\int_0^c f(x)\,dx$$

provided, of course, that the indicated limits exist. When an improper integral exists it is called convergent.* ►
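The definition treats an improper integral as the limit of ordinary integrals over longer and longer intervals. The Python sketch below illustrates this for the integrand $xe^{-x^2}$ on $[0, b]$; it is a hedged illustration only, and the helper names `midpoint_integral` and `partial_area`, as well as the particular upper limits tried, are assumptions made for the example.

```python
import math

def midpoint_integral(func, a, b, steps=50000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def partial_area(b):
    """Ordinary integral of x * exp(-x**2) over the finite interval [0, b]."""
    return midpoint_integral(lambda x: x * math.exp(-x * x), 0.0, b)

upper_limits = (1.0, 2.0, 4.0, 6.0)
areas = [partial_area(b) for b in upper_limits]
for b, area in zip(upper_limits, areas):
    print(b, round(area, 6))
```

The printed values increase with b and level off, which is the numerical signature of a convergent improper integral.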

Example 1 (a) Since $D(x\ln x - x) = \ln x$ we have

$$\int_1^\infty \ln x\,dx = \lim_{b\to\infty}\Big[x\ln x - x\Big]_1^b = \lim_{b\to\infty} b[\ln b - 1] + 1 = \infty$$

*Compare this definition with that for convergence of an infinite series (Section 6.2).




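the value of the integral is

$$\int_{-\infty}^{\infty} \frac{1}{x^2 + 2x + 5}\,dx = \tfrac12\left[-\left(-\frac{\pi}{2}\right) + \frac{\pi}{2}\right] = \frac{\pi}{2}$$ ►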

An integral is also called improper when the integrand is not bounded within the interval of integration. In this case we first restrict the interval and then take limits.

Definition 2 (i) If $|f(x)| \to \infty$ as $x \to a$, then

$$\int_a^b f(x)\,dx = \lim_{h\to 0}\int_{a+h}^b f(x)\,dx \qquad (h > 0)$$

(ii) If $|f(x)| \to \infty$ as $x \to b$, then

$$\int_a^b f(x)\,dx = \lim_{h\to 0}\int_a^{b-h} f(x)\,dx \qquad (h > 0)$$

(iii) If $|f(x)| \to \infty$ as $x \to c$ where $a < c < b$, then

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$$ ►

Example 3 (a) The function $f(x) = 1/\sqrt{x - 1}$ becomes infinite as $x \to 1$. Hence, to evaluate the integral

$$\int_1^2 \frac{1}{\sqrt{x - 1}}\,dx$$

we write

$$\int_1^2 \frac{1}{\sqrt{x - 1}}\,dx = \lim_{h\to 0}\int_{1+h}^2 \frac{1}{\sqrt{x - 1}}\,dx \quad (h > 0) \quad = \lim_{h\to 0} 2\sqrt{x - 1}\,\Big]_{1+h}^2 = \lim_{h\to 0} 2(1 - \sqrt{h}) = 2$$

(b) The function $g(x) = 1/(x - 1)^2$ becomes infinite as $x \to 1$. Thus

$$\int_1^2 \frac{1}{(x - 1)^2}\,dx = \lim_{h\to 0}\int_{1+h}^2 \frac{1}{(x - 1)^2}\,dx = \lim_{h\to 0}\left[-\frac{1}{x - 1}\right]_{1+h}^2 = \lim_{h\to 0}\left(-1 + \frac{1}{h}\right)$$

Since this limit does not exist, the integral has no value (diverges). ►
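The contrast between the two parts of Example 3 shows up clearly when the restricted integrals are tabulated for shrinking h. The Python sketch below is an illustration only; the function names `example_3a` and `example_3b` are hypothetical, and each function simply evaluates the antiderivative quoted in the example between the limits 1 + h and 2.

```python
import math

def example_3a(h):
    """Integral of 1/sqrt(x - 1) over [1 + h, 2], from the antiderivative 2 sqrt(x - 1)."""
    return 2 * math.sqrt(2 - 1) - 2 * math.sqrt(h)

def example_3b(h):
    """Integral of 1/(x - 1)**2 over [1 + h, 2], from the antiderivative -1/(x - 1)."""
    return -1 / (2 - 1) + 1 / h

for h in (1e-1, 1e-2, 1e-4, 1e-8):
    print(h, example_3a(h), example_3b(h))
```

As h shrinks, the first column of results settles toward 2 while the second grows without bound, matching the conclusions that (a) converges and (b) diverges.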



PROBLEMS 


In Problems 1-18, evaluate those improper integrals which are convergent 


3. / — </x 
J o v v 


r 12 

5. / esc 6 a 

Jo 


(16 - 


9. / tan 2 1 dt 

Jo 


„ r 

Jo 

i3. r^io 

Jo X 

-fi* 


19. For what values of A is 


20. For what values of a is 


4 S-A 

2o Vi __ Y 

6 -/ u (rbji* 

*• ->->>« e 


10. / e 2 ' rfv 


1, f ~d\ 

J - 

yo vjr+l 

r° 

16 - 

'*-L -*+77 


21. For what values of 6 is 


1410 IMPROPER INTEGRALS / 441 



22. Find the error 



(The integrand is positive, yet the integral is negative') 
23. Use integration by parts to show that if a > 0, 



24. A law of income states that

$$y(x) = \int_x^\infty (a - 1)u^{-a}\,du \qquad (x > 1)$$

where x denotes income level, $a > 1$ is a constant, and y(x) is the proportion of persons whose incomes exceed x. Find the proportion of persons falling in the income bracket between 3 and 4.

25. In Example 2 of Section 14 7, find 

/ E(t)dt 
Jo 

26. In Example 5 of Section 14 6, find the present value of a constant yearly 
income A over an infinite period (a perpetuity) 

27. In Example 1 of Section 14 7, find the present value of future savings 
over an infinite period Interpret this result when b > ar 


SUPPLEMENTARY READING 

Stein, S. K., Calculus in the First Three Dimensions (McGraw-Hill Book Company, New York, 1967), Chapters 1, 6-8, 14, and 21-23.

Thomas, G. B., Calculus and Analytic Geometry, 4th ed. (Addison-Wesley Publishing Company, Reading, Massachusetts, 1968), Chapters 5, 6, and 9.





DIFFERENTIAL EQUATIONS 15 


15.1 SEPARATION OF VARIABLES 

Equations involving derivatives are called differential equations. To solve a differential equation is to find all functions which, when substituted into the equation, yield an identity, that is, an expression valid for every value of the variable. The key result for determining all solutions of a differential equation is Theorem 1 of Section 14.3, which we repeat here for convenience. (Compare with Theorem 4 in Section 13.4.)

Theorem 1 If functions F and G have the same derivative, then there is a constant C such that for all x, $F(x) = G(x) + C$. ►

Example 1 In attempting to explain how excitatory tendencies E depend on the distance x from a goal, Hull* postulates the differential equation

$$DE(x) = -\frac{b}{x}$$

where b is a positive constant. Solving, we find

$$E(x) = \int -\frac{b}{x}\,dx = -b\ln x + C$$

If $E = E_0$ when $x = x_0$, then $E_0 = -b\ln x_0 + C$, or $C = E_0 + b\ln x_0$. Hence the function E is given by

$$E(x) = -b\ln x + E_0 + b\ln x_0 = E_0 + b\ln(x_0/x)$$ ►

*Hull, C. L., "The Goal Gradient Hypothesis Applied to Some Field-Force Problems in the Behavior of Young Children," Psychological Review 45, 271 (1938).





Example 2 Difference and differential equations are quite closely related 
and, depending on the approach preferred or the data available, may both 
be used to describe the same behavioral situation For instance, Harrod’s 
model of economic growth (Example 5 of Section 13 1) asserts that 

n< + 1) - no = | no 

where c and g are positive constants and Y(t) is national income at time t. The term $Y(t + 1) - Y(t)$ can be interpreted as the average change

Y(t +1) - Y(t) 

1 

over a unit time period. Harrod's model assumes this average change to be proportional to Y(t) itself.

If we assume that investment demands are sensitive to instantaneous changes in production (income), and that the instantaneous rate of change of income is proportional to income, then the average change over unit time is replaced by the limit

$$Y'(t) = \lim_{h\to 0}\frac{Y(t + h) - Y(t)}{h}$$

of the average change over an arbitrary time interval of length h, and the basic equation becomes

no = - g Y(n 

The results of Problem 15 in Section 12 7 imply that YQ) must have the form 
(0 — ke where the constant k is equal to the initial income T(0) This 
compares to the solution y« = y(0)[g/fe - c)]' of the ong.nal difference 
equation (See Section 13 5 ) 


A technique which makes it possible to solve many differential equations simply by finding antiderivatives is separation of variables. A differential equation involving a function f(x) is said to have separable variables if it is possible to write the equation in the form

$$h(f(x))f'(x) + g(x) = 0$$

where h and g are real valued functions. The solution of such an equation is obtained by finding the antiderivative of the left side, that is, by writing

$$\int h(f(x))f'(x)\,dx + \int g(x)\,dx = C$$

where C is a constant. This yields an equation involving f(x) and x which may be solved for f(x).





Example 3 To solve the differential equation

$$f'(x) = \frac{x f(x)}{\sqrt{1 - x^2}}$$

we first separate the variables to obtain

$$\frac{f'(x)}{f(x)} = \frac{x}{\sqrt{1 - x^2}}$$

Writing antiderivatives gives

$$\int \frac{f'(x)}{f(x)}\,dx = \int \frac{x}{\sqrt{1 - x^2}}\,dx$$

or, using Lines 5 and 7 of Table 1, Section 14.3,

$$\ln f(x) = -(1 - x^2)^{1/2} + C$$

Solving for f(x) yields

$$f(x) = ke^{-(1 - x^2)^{1/2}}$$

where $k = e^C$ is a constant. ►
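A solution obtained by separating variables can always be checked by substituting it back into the differential equation. The Python sketch below does this numerically for Example 3. It is an illustration only: the constant k = 2, the sample points, the step size of the central difference, and the names `solution` and `residual` are assumptions chosen for the example.

```python
import math

def solution(x, k=2.0):
    """Candidate solution f(x) = k exp(-(1 - x^2)^(1/2)) from Example 3."""
    return k * math.exp(-math.sqrt(1 - x * x))

def residual(x, k=2.0, step=1e-6):
    """f'(x) - x f(x)/sqrt(1 - x^2), with f'(x) from a central difference."""
    derivative = (solution(x + step, k) - solution(x - step, k)) / (2 * step)
    return derivative - x * solution(x, k) / math.sqrt(1 - x * x)

sample_points = (-0.9, -0.3, 0.0, 0.5, 0.8)
residuals = [residual(x) for x in sample_points]
print(residuals)
```

Every residual should be zero up to the small error of the numerical derivative, for any choice of the constant k.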


Example 4 Let N(t) denote the number of individuals in a society at time t and let I(t) be the number of inventors in the society at time t. Rashevsky* proposes

$$r(t) = \frac{I(t)}{N(t)}$$

as a measure of the technological development of the society and assumes that

$$r'(t) = [A + BN(t)][kr(t) - Lr^2(t)] \qquad (1)$$

where A, B, k, and L are positive constants with $L > k$.

Let us suppose an exponential population growth

$$N(t) = N_0e^{\lambda t}$$

where $N_0$ is initial population size and $\lambda$ is a growth constant. If we separate the variables, we can write the differential equation (1) as

$$\frac{r'(t)}{kr(t) - Lr^2(t)} = A + BN_0e^{\lambda t}$$

Computing the antiderivative of each side yields

$$\frac{1}{k}\ln\frac{r(t)}{k - Lr(t)} = At + \frac{BN_0}{\lambda}e^{\lambda t} + C \qquad (2)$$

where C is a constant. Thus the solution for r(t) is

$$r(t) = \frac{k}{L + e^{-k\phi(t)}}$$

where $\phi(t)$ denotes the right side of (2). Letting t become large, we find

$$\lim_{t\to\infty} r(t) = k/L$$ ►

*Rashevsky, N., Mathematical Biology of Social Behavior (University of Chicago Press, Chicago, Ill., 1951), p. 215.


PROBLEMS

Solve the differential equations in Problems 1-10. 

1. fit) = 3 //(/) 


_ dy . 2 
2. = 2 xy 


, dy 


' dx 
4 .fix) 


f\x)_±l 


_ d xy + y 
‘ dx ° x + xy 

1 . 2 .* + ,- 0 



9. y 2 - 


4ffy 
r ‘ 


10. sin 2 y + cos 2 x ^ = 0 


In Problems 11-13, find the particular solution of the differential equation 
which satisfies the given initial condition. 

11. 2 xf'ix) + fix) = 0, /(l) = 2 


4 

d> 

ds sin 6 + e 2 ' sin 


dy 

12. 4 y cos x + 3 sin x ^ =* 0; y = 2 when x - 


Mil PTC Sill p _ , „ , 

— — — — . s = 0 when 0 = A 7r 

2e* + e* cos 20 * 2 

14. Deese* proposed the differential equation $dR/dN = k(M - R)$, where R denotes response strength, N is the number of reinforcements, M is the fixed maximum value of R, and k is a constant. Show that if $R = 0$ when $N = 0$, the learning curve is $R = M(1 - e^{-kN})$.

15. Show that by making the substitution $y = vx$ the differential equation

$$\frac{dy}{dx} = g\left(\frac{y}{x}\right)$$

becomes an equation with separable variables. (Hint: If $y = vx$, then $y' = v + xv'$.)

16. Using the result of Problem 15, solve 

dx x x 

(b) ilx - y) + i4x - 2 y)/ = 0 


*Deese, J., The Psychology of Learning (McGraw-Hill Book Company, New York, 1952).




15.2 LINEAR DIFFERENTIAL EQUATIONS

The results obtained in Chapter 12 show that the derivative operator D has properties which parallel those listed in Properties 1-5 of Section 13.2 for the difference operator $\Delta$. Specifically, we know that:

(i) Operations of taking derivatives commute with each other and with the operation of multiplying by a constant. That is, for any positive integers k and r,

$$D^k[D^r f(x)] = D^r[D^k f(x)] = D^{k+r}f(x)$$

and, if c is a constant,

$$D^k[cf(x)] = cD^kf(x)$$

(ii) The operation of differentiation distributes over sums of functions. That is,

$$D^k[f_1(x) + f_2(x) + \cdots + f_n(x)] = D^kf_1(x) + D^kf_2(x) + \cdots + D^kf_n(x)$$

(iii) If we define the polynomial derivative operator

$$\phi_n(D) = c_0 + c_1D + c_2D^2 + \cdots + c_nD^n$$

by

$$\phi_n(D)f(x) = c_0f(x) + c_1Df(x) + c_2D^2f(x) + \cdots + c_nD^nf(x)$$

then

$$\phi_n(D)[af(x)] = a\phi_n(D)[f(x)]$$

$$\phi_n(D)[f(x) + g(x)] = \phi_n(D)[f(x)] + \phi_n(D)[g(x)]$$

If $\psi_m(D) = a_0 + a_1D + \cdots + a_mD^m$ is another such operator, then

$$[\phi_n(D) + \psi_m(D)]f(x) = \phi_n(D)[f(x)] + \psi_m(D)[f(x)]$$

$$\phi_n(D)[\psi_m(D)f(x)] = \psi_m(D)[\phi_n(D)f(x)] = [\phi_n(D) \times \psi_m(D)][f(x)]$$

the symbol $\times$ denoting ordinary multiplication of polynomials.

A differential equation is called linear if it has the form

$$\phi(D)f(x) = g(x)$$

where $\phi(D)$ is a polynomial derivative operator, g is a known function, and f is a function for which a solution is sought. It follows from the above discussion that all techniques employed in solving linear difference equations, such as factoring the operator polynomial and the method of complementary and particular solutions (see Sections 13.4-13.6), apply equally well to solving


15.2 LINEAR DIFFERENTIAL EQUATIONS / 447 



where C ,s a constant. Thus the solutton for r{ l) is 
_ k 

r(0 " l + 

where *(,) denotes the nght s.de of (2) Lettmg « become large, we find 
lim r(/) = k/L 


PROBLEMS 

Solve the differential equations in Problems 1-10 

1. /'(:) - 3 if 0) 


s dy 


Z.~- Zx‘y 
dx 


4. fix) 


fix) + 1 


5 & = xy + y 
‘ dx = X + xy 

7. 2 x^f + y - 0 
dx 

o 

* y dx 


6 x »_ *_ 9 ,o 

dy 

8. Ixf'ix) + Ax) “fix) 
0 


2 2 dy 

10. sin y + cos x 


In Problems 11-13, find the particular solution of the differential equation 
which satisfies the given initial condition 

11. 2 xf'(x) + fix) = 0, /(l) = 2 

12. 4y cos x + 3 smx^ = 0, = 2 when x => 

,, * sin 0 + e 2 * sin 6 „ , „ , 

dd 2e‘ + e* cos 26 2 

14. Deese* proposed the differential equation dRfdN — k ( M — R), where 
R denotes response strength, N is the number of reinforcements, M is the 
fixed maximum value of R, and A: is a constant Show that if R = 0 when 
N — 0, the learning curve is it = M(1 — e~ kN ) 

15. Show that by making the substitution y = ix the differential equation 

i-°© 

becomes an equation with separable variables ( Hint If y = ox, then 
y' = v + jcu' ) 

16. Using the result of Problem 15, solve 

(a) + cos 2 - 

dx X X 

(b) {2x - y) + (Ax - 2y)y' = 0 


•Deese, J , The Psychology of Learning (McGraw Hill Book Company, New York, 1952) 

446 / DIFFERENTIAL EQUATIONS 



15 2 LINEAR DIFFERENTIAL EQUATIONS 


The results obtained in Chapter 12 show that the derivative operator D has 
properties which parallel those listed in Properties 1-5 of Section 13 2 for the 
difference operator A Specifically, we know that 

(i) Operations of taking derivatives commute with each other and with 
the operation of multiplying by a constant That is, for any positive 
integers k and i, 

D k [DJ(x)\ = D'[D l f(x)] = D tJ >f(x) 
and, if £ is a constant, 

D l [c/(x)] = cD‘/(x) 

(u) The operation of differentiation distributes over sums of functions 
That is, 

D k [fi(x) -f/ 2 (x) + +/»(*)] 

- D k Mx) + D%(x) + + JD'fXx) 

(iii) If we define the polynomial derivative operator

$$\phi_n(D) = c_0 + c_1 D + c_2 D^2 + \cdots + c_n D^n$$

by

$$\phi_n(D) f(x) = c_0 f(x) + c_1 Df(x) + c_2 D^2 f(x) + \cdots + c_n D^n f(x)$$

then

$$\phi_n(D)[a f(x)] = a\,\phi_n(D)[f(x)]$$

$$\phi_n(D)[f(x) + g(x)] = \phi_n(D)[f(x)] + \phi_n(D)[g(x)]$$

If $\psi_m(D) = a_0 + a_1 D + \cdots + a_m D^m$ is another such operator,
then

$$[\phi_n(D) + \psi_m(D)] f(x) = \phi_n(D)[f(x)] + \psi_m(D)[f(x)]$$

$$\phi_n(D)[\psi_m(D) f(x)] = \psi_m(D)[\phi_n(D) f(x)] = [\phi_n(D) \times \psi_m(D)] f(x)$$

the symbol $\times$ denoting ordinary multiplication of polynomials.
A differential equation is called linear if it has the form

$$\phi(D) f(x) = g(x)$$

where $\phi(D)$ is a polynomial derivative operator, $g$ is a known function, and
$f$ is a function for which a solution is sought. It follows from the above
discussion that all techniques employed in solving linear difference equations,
such as factoring the operator polynomial and the method of complementary
and particular solutions (see Sections 13.4-13.6), apply equally well to solving
linear differential equations. The only change lies in the specific form of the
solution of the equation

$$(D - a)^k f(x) = 0$$

which arises, as in difference equations, from factoring the operator polynomial
and which forms the basis for writing down by inspection the solution
to any linear equation.

The following three theorems, which parallel Theorems 6, 7, and 8 for
differences (Sections 13.4 and 13.5), provide the necessary tools.

Theorem 2 A function $f$ is a solution of the differential equation

$$D^k f(x) = 0$$

if and only if $f$ is a polynomial of degree $k - 1$ or less, that is, if and only
if $f$ has the form

$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{k-1} x^{k-1} \qquad (3)$$

where $c_0, c_1, \ldots, c_{k-1}$ are constants.

PROOF Line 2 of Table 1, Section 14.3, constitutes a proof for $k = 1$.
Assume now that the statement of the theorem is true when $k = r$ and
consider the $(r + 1)$st-order equation $D^{r+1} f(x) = 0$. The substitution
$g(x) = Df(x)$ reduces the order and gives the equation $D^r g(x) = 0$. The
inductive assumption states that $g(x)$ has the form

$$g(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{r-1} x^{r-1}$$

where $a_0, a_1, a_2, \ldots, a_{r-1}$ are constants. In terms of $f$, this means that

$$Df(x) = g(x) = a_0 + a_1 x + \cdots + a_{r-1} x^{r-1}$$

Computing antiderivatives (using Lines 4 and 10 of Table 1, Section 14.3)
gives

$$f(x) = a_0 x + \frac{a_1}{2} x^2 + \cdots + \frac{a_{r-1}}{r} x^r + C$$

which has the form of (3) with $k = r + 1$. ►

Example 1 Let $Y(t)$ be the number of out-migrants from a community
by time $t$. If $D^3 Y(t) = 0$, Theorem 2 requires that $Y(t)$ have the form
$Y(t) = c_0 + c_1 t + c_2 t^2$, where $c_0$, $c_1$, and $c_2$ are constants.
This solution is easily verified by noting that $DY(t) = c_1 + 2c_2 t$,
$D^2 Y(t) = 2c_2$, and hence that $D^3 Y(t) = 0$. ►

Theorem 3 For all real numbers $a$ and positive integers $k$,

$$D^k[e^{-ax} f(x)] = e^{-ax}(D - a)^k[f(x)] \qquad (4)$$

PROOF Again we proceed by induction. For $k = 1$, differentiating the
product $e^{-ax} f(x)$ gives

$$D[e^{-ax} f(x)] = e^{-ax} Df(x) - a f(x) e^{-ax} = e^{-ax}(D - a)[f(x)] \qquad (5)$$

If (4) is valid for $k = r$, then

$$D^{r+1}[e^{-ax} f(x)] = D\{D^r[e^{-ax} f(x)]\}$$

$$= D[e^{-ax}(D - a)^r f(x)] \qquad \text{[by the inductive assumption]}$$

$$= e^{-ax}(D - a)[(D - a)^r f(x)] \qquad \text{[using (5) with } (D - a)^r f(x) \text{ in place of } f(x)\text{]}$$

$$= e^{-ax}(D - a)^{r+1}[f(x)]$$

►


Theorem 4 All solutions of the differential equation

$$(D - a)^k f(x) = 0$$

have the form

$$f(x) = (c_0 + c_1 x + \cdots + c_{k-1} x^{k-1}) e^{ax}$$

where $c_0, c_1, \ldots, c_{k-1}$ are constants.

PROOF Since $e^{-ax}$ can never be zero, it follows that $(D - a)^k f(x) = 0$ is
equivalent to $e^{-ax}(D - a)^k f(x) = 0$. In turn, this latter expression is,
according to Theorem 3, the same as $D^k[e^{-ax} f(x)] = 0$. Applying Theorem 2
gives

$$e^{-ax} f(x) = c_0 + c_1 x + \cdots + c_{k-1} x^{k-1}$$

from which the result follows upon multiplying both sides by $e^{ax}$. ►
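
The mechanism behind Theorems 3 and 4 can be checked with a small sketch. It rests on the identity $(D - a)[p(x)e^{ax}] = p'(x)e^{ax}$, which follows from the product rule, so applying $(D - a)$ repeatedly just differentiates the polynomial factor. The helper names and the sample coefficients below are hypothetical, chosen only for illustration.

```python
# Represent f(x) = p(x) * exp(a*x) by the coefficient list of p: [c0, c1, c2, ...].
# Because (D - a)[p(x) e^{ax}] = p'(x) e^{ax}, the value of a never enters the computation.

def poly_derivative(coeffs):
    """Coefficients of p'(x), given the coefficients of p(x) in increasing powers."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def apply_d_minus_a(coeffs, times):
    """Apply (D - a) to p(x) e^{ax} the given number of times; return the new polynomial factor."""
    for _ in range(times):
        coeffs = poly_derivative(coeffs)
    return coeffs

p = [5, -2, 7]  # hypothetical constants c0, c1, c2, so k = 3

print(apply_d_minus_a(p, 2))  # [14]: (D - a)^2 leaves a nonzero multiple of e^{ax}
print(apply_d_minus_a(p, 3))  # []: the zero polynomial, so (D - a)^3 f(x) = 0
```

An empty list stands for the zero polynomial; a polynomial factor of degree $k - 1$ is annihilated by exactly $k$ applications, as Theorem 4 asserts.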


The various techniques employed in solving linear difference equations
may now be adapted to linear differential equations, as in the following
examples. The only change is the use of Theorem 4 in place of Theorem 8,
Section 13.5, since we now deal with derivatives rather than differences.


Example 2 Individuals in a certain population of size $N$ may, at any time
$t$, be either in state A or in state B. Let $B(t)$ denote the number in state B at
time $t$, and suppose that during a time interval $(t, t + h)$ of length $h$, a
proportion $ah + o(h)$ of these individuals shift to state A. (Here $o(h)$ denotes
a quantity such that $\lim_{h \to 0} [o(h)/h] = 0$, and hence which is negligible
for small values of $h$.) Then the number $B(t + h)$ of individuals in state B at
time $(t + h)$ is given by

$$B(t + h) = B(t) - B(t)[ah + o(h)]$$

The average rate of change in the interval is

$$\frac{B(t + h) - B(t)}{h} = -B(t)\left[a + \frac{o(h)}{h}\right]$$

Passing to the limit we find

$$DB(t) = \lim_{h \to 0} \frac{B(t + h) - B(t)}{h} = -aB(t)$$

Rewriting this equation as $(D + a)B(t) = 0$ and applying Theorem 4
yields $B(t) = ce^{-at}$, where the constant $c$ is the initial number $B(0)$ of
individuals in state B. Note that $B(t) \to 0$ as $t \to \infty$. ►
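
As an illustration of the limiting argument in Example 2, the following sketch steps the discrete relation $B(t + h) = B(t) - B(t)\,ah$ forward with the $o(h)$ term dropped and compares the result with $B(0)e^{-at}$. The initial count, the rate $a$, and the time horizon are hypothetical values chosen only for the demonstration.

```python
import math

def simulate_state_b(b0, a, t_end, steps):
    """Step B(t + h) = B(t) - B(t) * a * h forward from t = 0 to t = t_end."""
    h = t_end / steps
    b = b0
    for _ in range(steps):
        b = b - b * a * h  # the o(h) term is neglected
    return b

b0, a, t_end = 1000.0, 0.3, 5.0  # hypothetical values
exact = b0 * math.exp(-a * t_end)

for steps in (10, 1000, 100000):
    approx = simulate_state_b(b0, a, t_end, steps)
    print(steps, approx, abs(approx - exact))
```

The gap between the stepped value and the exponential shrinks as $h$ decreases, which is the content of passing to the limit.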

Example 3 The equation $f'''(x) + 3f''(x) = 4f(x)$ may be written in
operator form as $(D^3 + 3D^2 - 4)f(x) = 0$ or, in factored form, as

$$(D - 1)(D + 2)^2 f(x) = 0 \qquad (6)$$

The separate equations $(D - 1)f(x) = 0$ and $(D + 2)^2 f(x) = 0$ have
solutions $c_1 e^x$ and $(c_2 + c_3 x)e^{-2x}$, respectively. By analogy to the argument
used for difference equations, this means that the complete solution of
(6) is

$$f(x) = c_1 e^x + (c_2 + c_3 x)e^{-2x}$$ ►
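
The factorization used in Example 3 can be confirmed by multiplying the operator polynomials as ordinary polynomials in $D$, which property (iii) permits. This is a minimal sketch; `poly_mul` is a hypothetical helper, and coefficient lists are written in increasing powers of $D$.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in increasing powers."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

d_minus_1 = [-1, 1]  # D - 1
d_plus_2 = [2, 1]    # D + 2

product = poly_mul(d_minus_1, poly_mul(d_plus_2, d_plus_2))
print(product)  # [-4, 0, 3, 1], that is, D^3 + 3D^2 - 4
```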

Example 4 To solve the equation

$$(D - 2)(D - 1)f(x) = 6 + 2x \qquad (7)$$

we first take two derivatives, annihilating the right side and obtaining

$$D^2(D - 2)(D - 1)f(x) = D^2[6 + 2x] = 0 \qquad (8)$$

Every solution of (8) has the form $f(x) = c_0 + c_1 x + c_2 e^{2x} + c_3 e^x$, of
which the terms $c_2 e^{2x} + c_3 e^x$ form the complementary solution of (7),
while $c_0$ and $c_1$ must be determined in order to find the particular solution
of the form $c_0 + c_1 x$.

The equation $(D - 2)(D - 1)[c_0 + c_1 x] = 6 + 2x$ requires

$$2c_0 - 3c_1 + 2c_1 x = 6 + 2x$$

Hence $c_1 = 1$ and $c_0 = \frac{9}{2}$, and the complete solution of (7) is

$$f(x) = c_2 e^{2x} + c_3 e^x + \tfrac{9}{2} + x$$ ►
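
The coefficient matching in Example 4 can be checked with exact arithmetic. The sketch below assumes only that $(D - 2)(D - 1) = D^2 - 3D + 2$, so that applying the operator to $c_0 + c_1 x$ gives $(2c_0 - 3c_1) + 2c_1 x$; the function name `residual` is hypothetical.

```python
from fractions import Fraction

# Match (2*c0 - 3*c1) + 2*c1*x against 6 + 2x.
c1 = Fraction(2, 2)              # coefficient of x: 2*c1 = 2
c0 = (Fraction(6) + 3 * c1) / 2  # constant term: 2*c0 - 3*c1 = 6

def residual(x):
    """(D^2 - 3D + 2) applied to c0 + c1*x, minus the right side 6 + 2x."""
    f, df, d2f = c0 + c1 * x, c1, 0
    return d2f - 3 * df + 2 * f - (6 + 2 * x)

print(c0, c1)                   # 9/2 1
print(residual(Fraction(7)))    # 0
```

A zero residual at every $x$ confirms that $\frac{9}{2} + x$ is a particular solution of (7).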


PROBLEMS 

1. Verify the computations for $c_0$ and $c_1$ in Example 4.



Example 5 In attempting to quantify Miller's theory of conflict, Anderson*
assumes that $dx/dt = G(x) - F(x)$ where, for a goal-seeking system, $x(t)$
denotes the distance from the goal at time $t$, $G(x)$ is the avoidance gradient,
and $F(x)$ the approach gradient. It is assumed that the avoidance and
approach gradients have the forms

$$G(x) = a(x_0 - x) + b \quad \text{and} \quad F(x) = c(x_0 - x) + d \qquad (9)$$

where the positive constants $c$ and $a$ represent reward and punishment
parameters, respectively, $x_0 = x(0)$ is the initial distance from the goal, and
$b$ and $d$ are parameters reflecting the experimental situation.
Substituting Equations (9) into the original differential equation gives
$dx/dt = (c - a)x + (a - c)x_0 + b - d$ or, in operator notation,

$$[D + (a - c)]x(t) = (a - c)x_0 + (b - d) \qquad (10)$$

Since the right side of (10) is constant, we have $D[D + (a - c)]x(t) = 0$,
giving a general solution form $x(t) = c_0 + c_1 e^{-(a-c)t}$. The particular
solution $c_0$, obtained by substituting into (10), is

$$c_0 = x_0 + \frac{b - d}{a - c}$$

This gives

$$x(t) = x_0 + \frac{b - d}{a - c} + c_1 e^{-(a-c)t}$$

from which, since $x(0) = x_0$, we find $c_1 = -(b - d)/(a - c)$. The complete
solution is

$$x(t) = x_0 + \frac{b - d}{a - c}\left[1 - e^{-(a-c)t}\right]$$

If $a > c$, then $\lim_{t \to \infty} x(t) = x_0 + (b - d)/(a - c)$, while
$\lim_{t \to \infty} |x(t)| = \infty$ if $c > a$.

We conclude that the system has an equilibrium point $x_0 + (b - d)/(a - c)$
if and only if $a > c$, that is, if and only if the punishment parameter exceeds
the reward parameter.

Since the system is unchanging at equilibrium, the equilibrium state may
also be obtained by setting $dx/dt = 0$. This means that $F(x) = G(x)$ or,
solving for $x$ in $a(x_0 - x) + b = c(x_0 - x) + d$, that

$$x = x_0 + \frac{b - d}{a - c}$$

as before. Although this formulation obscures the requirement $a > c$, it
does make clear that at equilibrium, the positive effect of the approach
gradient must exactly balance the negative effect of the avoidance gradient. ►
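
A numerical sketch can confirm that the closed-form solution of Example 5 satisfies $dx/dt = G(x) - F(x)$ and settles at the stated equilibrium when $a > c$. The parameter values below are hypothetical, and the derivative is approximated by a central difference rather than computed symbolically.

```python
import math

def x_of_t(t, x0, a, b, c, d):
    """Closed-form solution x(t) = x0 + (b - d)/(a - c) * (1 - exp(-(a - c) t))."""
    return x0 + (b - d) / (a - c) * (1.0 - math.exp(-(a - c) * t))

def gradient_difference(x, x0, a, b, c, d):
    """Right side of the model: G(x) - F(x)."""
    return (a * (x0 - x) + b) - (c * (x0 - x) + d)

params = dict(x0=10.0, a=2.0, b=3.0, c=0.5, d=1.0)  # hypothetical, with a > c

t, h = 0.7, 1e-6
slope = (x_of_t(t + h, **params) - x_of_t(t - h, **params)) / (2 * h)
print(slope, gradient_difference(x_of_t(t, **params), **params))

equilibrium = params["x0"] + (params["b"] - params["d"]) / (params["a"] - params["c"])
print(x_of_t(50.0, **params), equilibrium)
```

The two numbers on each printed line should agree closely: the first pair checks the differential equation, the second the limit as $t$ grows.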

*Anderson, N. H., "On the Quantification of Miller's Conflict Theory," Psychological
Review 69, 400-414 (1962).



Example 6 The operator $(D - a)^k$ serves as an annihilator for $x^{k-1}e^{ax}$
(Theorem 4). To solve the equation $f''(x) + f'(x) - 6f(x) = xe^{2x} + 2x^2$
or, in operator form,

$$(D + 3)(D - 2)f(x) = xe^{2x} + 2x^2 \qquad (11)$$

we first apply the operator $(D - 2)^2 D^3$ to both sides to obtain

$$(D + 3)(D - 2)^3 D^3 f(x) = 0$$

Solutions of this latter equation have the form

$$f(x) = c_1 e^{-3x} + (c_2 + c_3 x + c_4 x^2)e^{2x} + c_5 + c_6 x + c_7 x^2$$

The terms $c_1 e^{-3x} + c_2 e^{2x}$ constitute the complementary solution, leaving
$(c_3 x + c_4 x^2)e^{2x} + c_5 + c_6 x + c_7 x^2$ as the form of the particular solution.
Substituting the particular solution into (11) and equating coefficients of
like functions gives $c_3 = -\frac{1}{25}$, $c_4 = \frac{1}{10}$, $c_5 = -\frac{7}{54}$, $c_6 = -\frac{1}{9}$, and
$c_7 = -\frac{1}{3}$. Thus the complete solution of (11) is

$$f(x) = c_1 e^{-3x} + c_2 e^{2x} + \left(-\tfrac{1}{25}x + \tfrac{1}{10}x^2\right)e^{2x} - \tfrac{7}{54} - \tfrac{1}{9}x - \tfrac{1}{3}x^2$$ ►
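
The particular solution of Example 6 can be verified exactly by working with polynomial coefficient lists. The sketch assumes two identities that follow from the product rule: for a polynomial $q$, $(D^2 + D - 6)[q(x)e^{2x}] = [q''(x) + 5q'(x)]e^{2x}$, and for a polynomial $p$, $(D^2 + D - 6)p = p'' + p' - 6p$. The helper names are hypothetical.

```python
from fractions import Fraction as F

def deriv(p):
    """Derivative of a polynomial given as coefficients in increasing powers."""
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

def add(*polys):
    """Sum of polynomials of possibly different lengths."""
    n = max(len(p) for p in polys)
    return [sum(p[i] if i < len(p) else F(0) for p in polys) for i in range(n)]

def scale(s, p):
    return [s * c for c in p]

q = [F(0), F(-1, 25), F(1, 10)]      # factor multiplying e^{2x}: -x/25 + x^2/10
p = [F(-7, 54), F(-1, 9), F(-1, 3)]  # polynomial part: -7/54 - x/9 - x^2/3

exp_part = add(deriv(deriv(q)), scale(5, deriv(q)))       # should be x
poly_part = add(deriv(deriv(p)), deriv(p), scale(-6, p))  # should be 2x^2

print(exp_part == [0, 1])      # True
print(poly_part == [0, 0, 2])  # True
```

Together the two checks reproduce the right side $xe^{2x} + 2x^2$ of (11).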


PROBLEMS 

2. Verify the computations for Example 6.

3. Show that the differential equation $D^2 g(t) - b^2 g(t) = f(t)$ has a
complementary solution of the form $c_0 e^{-bt} + c_1 e^{bt}$.

In Problems 4-8, solve the differential equation subject to the given initial
conditions.

4. $D^2 y(t) - 4Dy(t) + 4y(t) = 0$, $y(0) = 6$, $y(1) = e^2$

5 KDxQ) + 6x(0 - / 2 , x(0) - 1 

6. $D^2 f(t) - 6Df(t) + 5f(t) = 6$, $f(0) = 4$, $Df(0) = 10$

7 6g"(*) - 36g'(*) + 54g(jc) = \lx, g(0) = 1, Dg(0) - 8 

8. $D^2 h(t) = 3Dh(t) + 12$, $h(0) = 3$, $Dh(0) = 10$

Find particular solutions for the differential equations in Problems 9-11.

9. $D^2 f(t) - 3Df(t) + 2f(t) - t^2 e^{2t} = 0$

10. $D^2 x(t) + 2Dx(t) - 3x(t) = e^t \sin t$ (Hint: Assume a particular
solution of the form $(c_0 \sin t + c_1 \cos t)e^t$.)

11. $D^2 g(t) + 8Dg(t) + 16g(t) = e^{4t} \cos 2t$ (Hint: Assume a particular
solution of the form $(c_0 \cos 2t + c_1 \sin 2t)e^{4t}$.)

12. In a model of buying behavior, Longton* assumes that a customer's
decision to buy a brand item depends upon the image the customer has
of the brand. The image is assumed to be composed of a set $S$ of $N$
elements which come to be associated with the brand in an all-or-none
manner. (No partial associations are allowed.) An element once associated
may also be disassociated from the brand.
Let $\lambda h$ be the proportion of disassociated elements that become associated
with the brand during time $h$, $\mu h$ be the proportion of associated
elements that become disassociated from the brand during time $h$, and
$N_1(t)$ be the number of elements associated with the brand at time $t$.
If $p(t) = N_1(t)/N$ is the proportion of the set of total elements which
are associated with the brand at time $t$,

(a) show that

$$p(t) = \frac{\lambda}{\lambda + \mu}\left[1 - e^{-(\lambda + \mu)t}\right] + p(0)e^{-(\lambda + \mu)t}$$

(b) find the equilibrium state and indicate whether it may be obtained
by setting the derivative $p'(t)$ equal to zero.

*Longton, P. A., "Mathematics, Models, and Marketing," Human Relations 18, 289-296
(1965).

13. In a study of quantal absorption in a visual receptor, Cornsweet† assumes
that the net rate of change of regenerated molecules in the receptor is

$$\frac{dN}{dt} = K(M - N) - AQN$$

where $K$, $M$, $A$, and $Q$ are constants. Find $N(t)$ assuming that $N(0) = M$.

14. The demand and supply functions for a certain product are, respectively,

$$D(t) = a + bP(t) + cDP(t) \qquad (a > 0,\ b < 0)$$

$$S(t) = \alpha + \beta P(t) \qquad (\alpha < 0,\ \beta > 0)$$

where $P(t)$ is the price at time $t$, and $a$, $b$, $c$, $\alpha$, and $\beta$ are constants.

(a) Solve the differential equation for $P(t)$ which results if we assume
that $D(t) = S(t)$.

(b) Determine relations among the constants which lead to an equilibrium
state for the system and determine the equilibrium state which
obtains.

15. Modify Problem 14 by assuming that

$$D(t) = a + bP(t) + cDP(t) + dD^2P(t) \qquad (a > 0,\ b < 0)$$

$$S(t) = \alpha + \beta P(t) + \gamma DP(t) + \delta D^2P(t) \qquad (\alpha < 0,\ \beta > 0)$$

Again assuming $D(t) = S(t)$, find the time path for price (that is, solve
for $P(t)$) when

'»(«)'>*(«) 

In each case, determine necessary and sufficient conditions for a stable
equilibrium.

†Cornsweet, T. N., "Changes in the Appearance of Stimuli of Very High Luminance,"
Psychological Review 69, 257-273 (1962).



16. The Domar burden-of-debt model* assumes that

$$D'(t) = \alpha Y(t) \qquad 0 < \alpha < 1$$

where $D(t)$ denotes the total outstanding public debt and $Y(t)$ the
national income at time $t$. Given that the interest rate $i$ is constant,

$$T(t) = iD(t)$$

where $T(t)$ is the interest-induced tax. Income is assumed to grow at a
constant relative rate $\beta$ so that

$$Y'(t) = \beta Y(t) \qquad 0 < \beta < 1$$

The burden of debt $B(t)$ is therefore defined as

$$B(t) = \frac{T(t)}{Y(t)} = \frac{iD(t)}{Y(t)}$$

(a) Find the time path for $B(t)$.

(b) Determine the equilibrium state for the burden of debt.


17. A system consists of two sequential stages. Let $V_1(t)$ and $V_2(t)$ denote
the respective outputs at time $t$ of stages 1 and 2. If $DV_1(t) = \alpha V_1(t)$ and
$cDV_2(t) + [V_2(t)/R] = \beta V_1(t)$, where $c$, $R$, $\alpha$, and $\beta$ are constants, find
$V_2(t)$.


SUPPLEMENTARY READING 

Coddington, E. A., An Introduction to Ordinary Differential Equations (Prentice-
Hall, Inc., Englewood Cliffs, N.J., 1961).



POWER SERIES 


16 


16.1 TAYLOR SERIES 

Taylor's Theorem states that if a function $f$ has $n$ derivatives on an
interval containing the points $a$ and $x$, then there exists a point $c$
between $a$ and $x$ at which

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(x - a)^{n-1} + \frac{f^{(n)}(c)}{n!}(x - a)^n \qquad (1)$$

For fixed $a$, the numbers $f(a), f'(a), \ldots, f^{(n-1)}(a)$ are constants, and the
Taylor Formula (1) may be thought of as approximating the
value of $f(x)$ by calculating the following $(n - 1)$st-degree Taylor
polynomial:

$$p_{n-1}(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(x - a)^{n-1} \qquad (2)$$

The term

$$\frac{f^{(n)}(c)}{n!}(x - a)^n$$

in (1) indicates the magnitude of the error made in this approximation.


Example 1 Since $e^0 = 1$ and all derivatives of $f(x) = e^x$ are also $e^x$, the
successive polynomial approximations to $e^x$ obtained by using $a = 0$ in
(2) are

$$p_0(x) = e^0 = 1$$

$$p_1(x) = e^0 + e^0 x = 1 + x$$

$$p_2(x) = e^0 + e^0 x + \frac{e^0 x^2}{2!} = 1 + x + \frac{x^2}{2!}$$

$$p_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$

Figure 1 shows the function $e^x$ and the Taylor polynomial approximations
for $n = 0, 1, 2,$ and $3$. Each of the polynomials passes through the functional
value $e^0 = 1$ when $x = 0$. For values of $x$ near zero, the higher-degree
polynomials provide the better approximations. However, although this
situation holds for positive $x$, it does not hold for all negative $x$. In fact, the
first approximation $p_0(x) = 1$ is the best of those pictured for large negative
values of $x$. ►

FIGURE 1 Successive Taylor polynomial approximations to $f(x) = e^x$





The infinite series

$$f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n \qquad (3)$$

is called the Taylor series of the function $f$, taken about the point $a$. It is apparent
from (1) that this series converges to $f(x)$ if and only if the sequence of
remainder, or error, terms $[f^{(n)}(c)/n!](x - a)^n$ converges to zero. Luckily,
this is the case for the functions of particular interest to us.

Example 1' For the exponential function $f(x) = e^x$, the Taylor series
about the point zero (that is, $a = 0$) is

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad (4)$$

The error term for the $n$th approximation is $(1/n!)e^c x^n$, which cannot exceed
$(1/n!)e^x x^n$ when $x$ is positive, or $(1/n!)e^0 |x|^n = (1/n!)|x|^n$ when $x$ is negative.
For each fixed $x$, the sequence of error terms converges to zero and it follows
that (4) does indeed represent $e^x$ for all values of $x$. ►
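
The error bound described in Example 1' can be illustrated numerically. The sketch below evaluates the Taylor polynomial of $e^x$ about zero and compares the actual error of the $(n-1)$st-degree polynomial with the bound $\max(e^x, 1)\,|x|^n/n!$, which covers both the positive and the negative case. The function names and sample points are hypothetical.

```python
import math

def taylor_exp(x, degree):
    """Taylor polynomial of e^x about 0, up to and including the x**degree term."""
    return sum(x ** k / math.factorial(k) for k in range(degree + 1))

def error_bound(x, n):
    """Bound on |e^c x^n / n!| for c between 0 and x."""
    return max(math.exp(x), 1.0) * abs(x) ** n / math.factorial(n)

for x, n in ((1.5, 6), (-3.0, 10), (-3.0, 20)):
    actual = abs(math.exp(x) - taylor_exp(x, n - 1))
    print(x, n, actual, error_bound(x, n))
```

In each row the actual error stays below the bound, and the bound itself shrinks as $n$ grows for fixed $x$.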

Example 2 For $f(x) = \sin x$, the Taylor series of $f$ about the point zero
is determined by computing successive derivatives at zero. Since $\sin 0 = 0$,
$\cos 0 = 1$, $D \sin x = \cos x$, and $D \cos x = -\sin x$, we have

$$f(0) = \sin 0 = 0$$
$$f'(0) = \cos 0 = 1$$
$$f''(0) = -\sin 0 = 0$$
$$f'''(0) = -\cos 0 = -1$$

In general, $f^{(n)}(0) = 0$ if $n$ is even, $f^{(n)}(0) = 1$ for $n = 1, 5, 9, 13, \ldots$,
and $f^{(n)}(0) = -1$ for $n = 3, 7, 11, 15, \ldots$. Thus the Taylor series for
$\sin x$ is

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k + 1)!} \qquad (5)$$

The fact that the sine function is an odd function (that is, $\sin(-x) = -\sin x$)
is reflected in the appearance of only odd powers of $x$ in the series (5).

The error term for the $n$th approximation has the form

$$\frac{f^{(n)}(c)}{n!} x^n$$

where $f^{(n)}(c)$ is either $\pm\sin c$ or $\pm\cos c$ and $c$ is between 0 and $x$. In either
case, $|f^{(n)}(c)| \le 1$ and the magnitude of the error term cannot exceed






Taylor series for the other trigonometric functions may be determined in a
similar fashion. However, they are somewhat messy, and it is usually easier
to proceed by computing appropriate values of the sine and cosine functions
from (5) and (6) and then determining the other desired values from these
values.

Example 4 A Taylor series about the point zero is called a McLaurin
series.* We cannot write a McLaurin series for the natural logarithm function
since $\ln 0$ does not exist. In this case the most convenient point about which
to write a Taylor series is $a = 1$, since $\ln 1 = 0$.
The successive derivatives of $\ln x$ are

$$D \ln x = \frac{1}{x} = x^{-1}$$

$$D^2 \ln x = Dx^{-1} = -x^{-2}$$

$$D^3 \ln x = D(-x^{-2}) = 2x^{-3}$$

Hence the Taylor series for the logarithm function, taken about $x = 1$, is

$$\ln x = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots = \sum_{k=1}^{\infty} (-1)^{k-1} \frac{(x - 1)^k}{k} \qquad (7)$$

We leave it to the reader to show that this series converges only for $0 < x \le 2$.
A more commonly used form of (7), obtained by substituting $1 + y$ for
$x$, is

$$\ln(1 + y) = y - \frac{y^2}{2} + \frac{y^3}{3} - \frac{y^4}{4} + \cdots \qquad (8)$$

which converges for $-1 < y \le 1$. Replacing $y$ by $-y$ in (8) gives

$$\ln(1 - y) = -y - \frac{y^2}{2} - \frac{y^3}{3} - \frac{y^4}{4} - \cdots \qquad (9)$$

converging for $-1 \le y < 1$. Combining (8) and (9) yields

$$\ln \frac{1 + y}{1 - y} = \ln(1 + y) - \ln(1 - y) = 2\left(y + \frac{y^3}{3} + \frac{y^5}{5} + \cdots\right)$$

which converges for $-1 < y < 1$.
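
To see why the combined form is the more useful one in practice, the sketch below estimates $\ln 2$ in two ways: from series (8) with $y = 1$, and from the combined series with $y = 1/3$, since $(1 + \frac{1}{3})/(1 - \frac{1}{3}) = 2$. The function names are hypothetical, and `math.log` is used only as the reference value.

```python
import math

def log1p_series(y, terms):
    """Partial sum of series (8): y - y^2/2 + y^3/3 - ..."""
    return sum((-1) ** (k + 1) * y ** k / k for k in range(1, terms + 1))

def log_ratio_series(y, terms):
    """Partial sum of 2(y + y^3/3 + y^5/5 + ...), which represents ln((1 + y)/(1 - y))."""
    return 2 * sum(y ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

reference = math.log(2.0)
print(abs(log1p_series(1.0, 5) - reference))          # slow: error still near 0.09
print(abs(log_ratio_series(1.0 / 3.0, 5) - reference))  # fast: error of order 1e-6
```

With the same number of terms, the combined series is accurate to several more decimal places because its terms shrink geometrically.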

*After Colin McLaurin (1698-1746), a student of Sir Isaac Newton.



$|x^n/n!|$. Since $\{|x^n|/n!\}$ converges to zero as $n \to \infty$, it follows that for every
real number $x$, the Taylor series (5) represents $\sin x$.

Using standard trigonometric identities, values of $\sin x$ can be determined
for any $x$ if we know values for $x$ between $-\pi/2$ and $\pi/2$. In this range, the
series (5) provides a good approximation with relatively few terms. For
instance, using the first two nonzero terms of (5) with $x = \pi/6$ gives $\sin \pi/6$
approximately as

$$\sin \frac{\pi}{6} \approx \frac{\pi}{6} - \frac{1}{3!}\left(\frac{\pi}{6}\right)^3$$

Since $\pi/6 \approx 3.1416/6 = 0.5236$, our approximation is

$$\sin \frac{\pi}{6} \approx (0.5236) - \frac{(0.5236)^3}{6} = 0.5236 - 0.0239 = 0.4997$$

to four decimal places.

Since we know $\sin \pi/6 = \frac{1}{2}$, the actual error in this approximation is
easily computed as $\frac{1}{2} - 0.4997 = 0.0003$. However, in a real application of
the Taylor series, the true value would be unknown and it would be necessary
to estimate the magnitude of the error. In the present case, since (5) is an
alternating series, the next term $(1/5!)(\pi/6)^5 = 0.0003$ provides an upper
bound on the possible error and shows that our estimate 0.4997 is correct
to at least three decimal places.
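
The arithmetic of this approximation can be reproduced in a few lines. The sketch sums the first two nonzero terms of series (5) at $x = \pi/6$ and uses the next term as the alternating-series error bound; the function name `sin_series` is hypothetical.

```python
import math

def sin_series(x, terms):
    """Sum of the first `terms` nonzero terms of x - x^3/3! + x^5/5! - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

x = math.pi / 6
approx = sin_series(x, 2)
next_term = x ** 5 / math.factorial(5)

print(round(approx, 4))        # 0.4997
print(abs(0.5 - approx))       # actual error, about 0.0003
print(next_term)               # bound from the next term, also about 0.0003
```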


PROBLEMS 

1. Use the first three terms of the series (5) to estimate sin 75. Use the
fourth term to provide an upper bound on the error.

2. Same as Problem 1 for $\sin(-2\pi/5) = \sin(-72°)$.

3. Same as Problem 1 for $\sin(21\pi/5)$. (Hint: Compute in terms of an
angle between $-\pi/2$ and $\pi/2$.)

4. Same as Problem 1 for $\sin(-12\pi/7)$.


Example 3 By an argument similar to that of Example 2, we find

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} \qquad (6)$$

We leave the verification of this formula, including the fact that the sequence of
error terms converges to zero, as an exercise (Problem 29).



so

$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} (-1)^k (-x)^k$$

and we obtain the familiar geometric series

$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k$$ ►


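Example 6 Taylor's Theorem has been used by Sidman* to show that if
in a group of $N$ persons, individual learning curves have the form

$$y_i(t) = 1 - e^{-k_i t} \qquad (i = 1, 2, \ldots, N)$$

then the average

$$\bar{y}(t) = \frac{1}{N} \sum_{i=1}^{N} y_i(t)$$

cannot possibly have the same form $\bar{y}(t) = 1 - e^{-kt}$ unless all the constants
$k_i$ are equal (to $k$). Since the Taylor series for $y_i(t)$ is

$$y_i(t) = 1 - e^{-k_i t} = k_i t - \frac{(k_i t)^2}{2!} + \frac{(k_i t)^3}{3!} - \cdots$$

we must have

$$\bar{y}(t) = \frac{1}{N} \sum_{i=1}^{N} y_i(t) = \left(\frac{1}{N} \sum_{i=1}^{N} k_i\right) t - \left(\frac{1}{N} \sum_{i=1}^{N} k_i^2\right) \frac{t^2}{2!} + \cdots$$

On the other hand, if $\bar{y}(t) = 1 - e^{-kt}$, then

$$\bar{y}(t) = kt - \frac{k^2 t^2}{2!} + \frac{k^3 t^3}{3!} - \cdots$$

Equating coefficients of the first two powers of $t$ gives

$$k = \frac{1}{N} \sum_{i=1}^{N} k_i \quad \text{and} \quad k^2 = \frac{1}{N} \sum_{i=1}^{N} k_i^2$$

Together, these equations require

$$\frac{1}{N} \sum_{i=1}^{N} k_i^2 = \left(\frac{1}{N} \sum_{i=1}^{N} k_i\right)^2 \qquad (12)$$

But it is easily verified that

$$\frac{1}{N} \sum_{i=1}^{N} k_i^2 - \left(\frac{1}{N} \sum_{i=1}^{N} k_i\right)^2 = \frac{1}{N} \sum_{i=1}^{N} (k_i - \bar{k})^2$$

where $\bar{k} = (1/N) \sum_{i=1}^{N} k_i$. Hence, (12) holds if and only if $k_i = \bar{k}$ for
all $i$. (See Problem 14(b), Section 6.1.) ►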
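
The identity at the end of Example 6 says that the gap between the mean of the squares and the square of the mean is the variance of the rates, which vanishes only when all rates coincide. A short sketch with hypothetical rate constants illustrates this, along with the resulting gap between the averaged curve and a single exponential curve built from the mean rate.

```python
import math

def mean(values):
    return sum(values) / len(values)

rates = [0.2, 0.5, 1.1]  # hypothetical individual constants k_i
k_bar = mean(rates)

gap = mean([k * k for k in rates]) - k_bar ** 2
variance = mean([(k - k_bar) ** 2 for k in rates])
print(gap, variance)  # equal up to rounding, and positive

t = 2.0
average_curve = mean([1 - math.exp(-k * t) for k in rates])
single_curve = 1 - math.exp(-k_bar * t)
print(average_curve, single_curve)  # differ because the rates are unequal
```

Replacing `rates` by a list of identical values makes both `gap` and the difference between the two curves zero.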

*Sidman, M., "A Note on Functional Relations Obtained from Group Data," Psychological
Bulletin 49, 263-269 (1952).



Example 5 Binomial Expansions In Section 7.3 we found that if $n$ is a
positive integer, then

$$(1 + t)^n = \sum_{k=0}^{n} \binom{n}{k} t^k \qquad (10)$$

The binomial coefficient $\binom{n}{k}$ is defined by

$$\binom{n}{k} = \frac{(n)_k}{k!} = \frac{n(n - 1)(n - 2) \cdots (n - k + 1)}{k!}$$

and has a counting interpretation as the number of different subsets (samples)
of size $k$ which may be chosen from a set of $n$ objects.

When $n$ is not a positive integer, no such counting interpretation is possible
and our original proof of Formula (10) is no longer valid. However, a
McLaurin series for $f(t) = (1 + t)^n$ is easily obtained regardless of the
value of $n$. We have

$$f(0) = (1 + t)^n \big|_{t=0} = (1 + 0)^n = 1^n = 1$$
$$f'(0) = n(1 + t)^{n-1} \big|_{t=0} = n$$
$$f''(0) = n(n - 1)(1 + t)^{n-2} \big|_{t=0} = n(n - 1)$$

and, by induction, for $k = 0, 1, 2, \ldots$,

$$f^{(k)}(0) = n(n - 1) \cdots (n - k + 1) = (n)_k$$

Note that this holds whether or not $n$ is a positive integer. Thus for any
value whatsoever of $n$, we find, using Equation (3), that

$$(1 + t)^n = 1 + nt + \frac{n(n - 1)}{2!} t^2 + \frac{n(n - 1)(n - 2)}{3!} t^3 + \cdots = \sum_{k=0}^{\infty} \binom{n}{k} t^k \qquad (11)$$

In particular, since

$$\binom{n}{k} = 0$$

when $n$ is an integer and $k > n$, Formula (10) is merely a special case of (11).
We shall make great use of (11) in later sections. Here we consider only
one particular case. When $n = -1$ and $t = -x$ in (11), we obtain

$$\binom{-1}{k} = \frac{(-1)(-2)(-3) \cdots (-k)}{k!} = \frac{(-1)^k k!}{k!} = (-1)^k$$



29. Complete the details of Example 3 

30. Complete the details of Example 4 


16.2 DIFFERENTIATION AND INTEGRATION 
OF POWER SERIES 

A series of the form

$$C(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots = \sum_{n=0}^{\infty} c_n x^n \qquad (13)$$

where $c_0, c_1, c_2, \ldots$ are constants, is called a power series. Every Taylor
series about zero, whether or not it converges to the values of the function
from which it was generated, is such a power series.
The series (13) converges at $x = 0$, since $C(0) = c_0$. It may also converge
for values of $x$ other than zero.

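Example 1 (a) Applying the ratio test (Section 6.4) to the series

$$x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots \qquad (14)$$

leads to

$$\lim_{n \to \infty} \left| \frac{x^{n+1}/(n + 1)}{x^n/n} \right| = |x| \lim_{n \to \infty} \frac{n}{n + 1} = |x|$$

Thus this series converges absolutely if $|x| < 1$ and diverges if $|x| > 1$.
When $|x| = 1$, the ratio test fails and we must investigate each case separately.
For $x = -1$ the series is a convergent alternating series

$$-1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots$$

while for $x = 1$, the series is

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$$

the divergent harmonic series. Thus the series (14) converges when
$-1 \le x < 1$ and diverges outside this interval.

(b) The Taylor series for $e^x$, $\sin x$, and $\cos x$ (Examples 1-3 of Section 16.1)
converge for all real $x$.

(c) The ratio test applied to the series

$$1 + x + 2!x^2 + 3!x^3 + \cdots$$

leads to the limit

$$\lim_{n \to \infty} \left| \frac{(n + 1)!\,x^{n+1}}{n!\,x^n} \right| = \lim_{n \to \infty} (n + 1)|x|$$

which exists only if $x = 0$.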
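
The behavior of the ratio test in parts (a) and (c) can be observed numerically. The sketch below computes the ratio of successive terms for a few values of $n$; the sample values of $x$ are hypothetical, and the function names are introduced only for this illustration.

```python
import math

def ratio_a(x, n):
    """|a_{n+1} / a_n| for series (14), whose nth term is x**n / n."""
    return abs((x ** (n + 1) / (n + 1)) / (x ** n / n))

def ratio_c(x, n):
    """|a_{n+1} / a_n| for the series in part (c), whose nth term is n! * x**n."""
    return abs((math.factorial(n + 1) * x ** (n + 1)) / (math.factorial(n) * x ** n))

for n in (1, 10, 100):
    print(n, ratio_a(0.9, n))   # approaches |x| = 0.9

for n in (1, 10, 50):
    print(n, ratio_c(0.01, n))  # grows like (n + 1)|x|, without bound
```

For series (14) the ratios settle at $|x|$, so the test decides convergence by comparing $|x|$ with 1. For the series in part (c) the ratios grow without bound for any nonzero $x$, however small.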




PROBLEMS 

In Problems 5-14, find the Taylor series of the indicated function in powers
of $x - a$, where $a$ is as indicated.

5. $6 + 3x + 4x^2 + 2x^3 + x^4$, $a = 0$

6. $6 + 3x + 4x^2 + 2x^3 + x^4$, $a = 1$

7. $2^x$, $a = 0$

8. $\cos x$, $a = \pi/3$


10. $\sqrt{x}$, $a = 4$

11. $\dfrac{\sin x}{x}$, $a = 0$ (Expand $\sin x$ and then divide by $x$.)

12. $xe^{-2x}$, $a = 0$



14. e"* 2 , a = 0 

In Problems 15-20, write the $n$th-degree Taylor polynomial around $a = 0$
for the given function and value of $n$ and indicate the form of the remainder
(error) term.

15. $e^{-x}$, $n = 4$          16. $\ln(1 - x^2)$, $n = 4$

17. $\sin^{-1} x$, $n = 4$     18. $\tan^{-1} x$, $n = 5$

19. $\sqrt{1 + x}$, $n = 5$    20. $1/(1 + e^x)$, $n = 4$

21. In Problem 19, use the polynomial you obtained to estimate $\sqrt{1.05}$.
Give an upper bound for the error in your estimate.

22. Same as Problem 21 for $\sqrt{0.90}$.

23. Same as Problem 21 for $\sin^{-1} \frac{1}{2}$. (Use Problem 17.) Compare your results
with the true value $\pi/6 = \sin^{-1} \frac{1}{2}$.

In Problems 24-27, use the Taylor Formula (1) to find the indicated quantity
correct to three decimal places.

24. $\ln 1.05$ (Choose $a = 1$.)

25. $e^{2.2}$ (Assume $e^2 = 7.3891$ is known. What problems do you encounter
if you take $a = 0$?)

26. $\sin 92°$          27. $\cos 48°$

28. Define a function $f$ by $f(0) = 0$ and $f(x) = e^{-1/x^2}$ when $x \ne 0$. Show
that the McLaurin series for $f$ has only zero coefficients. Hence this
series does not represent the function except when $x = 0$.


We see that for every power series, there is an interval inside of which the
series converges and outside of which the series diverges. As suggested by
Example 1, the ratio test constitutes the most useful tool for determining the
interval of convergence of a power series.
The particular importance of the interval of convergence lies in the fact
that within this interval the usual operations that one may perform on polynomials,
such as term-by-term integration and differentiation, apply equally
well to power series. The next three theorems state the key results. We shall
omit the proofs.

Theorem 2 Inside its interval of convergence, a power series is continuous. ►

Theorem 3 Inside its interval of convergence, a power series may be
integrated term by term and the resulting series is the integral of the original
series. Except possibly for the endpoints, the integral series has the same
interval of convergence as the original series. ►

Theorem 4 Inside its interval of convergence, a power series may be differentiated
term by term and the resulting series is the derivative of the
original series. Except possibly for the endpoints, the derivative series has
the same interval of convergence as the original series. ►


Example 2 By Taylor's theorem,

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + \frac{(-1)^n x^{2n}}{(2n)!} + \cdots$$

Term-by-term differentiation of the series yields

$$-\sin x = -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \cdots$$

which agrees with the results obtained in Example 2 of Section 16.1.
Example 3 The definite integral

$$\int_0^x \frac{\sin \theta}{\theta}\, d\theta$$

may be evaluated by the use of series as follows. From the McLaurin expansion
of $\sin \theta$, we have

$$\frac{\sin \theta}{\theta} = 1 - \frac{\theta^2}{3!} + \frac{\theta^4}{5!} - \frac{\theta^6}{7!} + \cdots$$



In each part of Example 1, the set of values of $x$ for which the series
converges is an interval. This type of set of convergence is typical of all
power series.
To investigate the details of convergence, we first define the radius of
convergence of a power series $C(x) = c_0 + c_1 x + c_2 x^2 + \cdots$ to be the
supremum of the absolute values of those $x$ for which the series converges.
Thus, in Example 1 the radii of convergence for the series in parts (a), (b),
and (c) are, respectively, 1, $\infty$, and 0. The justification for the term radius of
convergence is contained in the following theorem.

Theorem 1 Let $R$ denote the radius of convergence of the series

$$C(x) = c_0 + c_1 x + c_2 x^2 + \cdots = \sum_{n=0}^{\infty} c_n x^n$$

Then $C(x)$ converges absolutely for all $x$ such that $|x| < R$ and diverges if
$|x| > R$ (Figure 2).

FIGURE 2 Radius of convergence: the series converges for $-R < x < R$ and diverges for $|x| > R$

PROOF The fact that $C(x)$ diverges if $|x| > R$ is part of the definition of
radius of convergence. Moreover, if $R = 0$, there is nothing to prove. Thus,
suppose $R > 0$ and choose a number $y$ such that $0 < y < R$. Since by
definition $R$ is the least upper bound of the absolute values of those $x$ for
which the series converges, there exists a number $z > y$ such that either
$C(z)$ or $C(-z)$ converges. Whichever of these series converges, the terms of
that series form a bounded sequence. That is, there exists a positive number
$M$ such that $|c_n z^n| < M$ for $n = 0, 1, 2, 3, \ldots$.

For $|x| \le y$, we have

$$|c_n x^n| = |c_n z^n| \left|\frac{x}{z}\right|^n < M \left|\frac{x}{z}\right|^n$$

Hence

$$\sum_{n=0}^{\infty} |c_n x^n| \le M \sum_{n=0}^{\infty} \left|\frac{x}{z}\right|^n$$

Since $|x/z| < 1$, this latter series is a convergent geometric series. This
completes the proof. ►



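To make (18) an identity, we equate coefficients of $N(t)$ and its derivatives
on both sides, giving

$$c_0 = \frac{1}{L_0} \qquad c_1 = -\frac{L_1}{L_0^2} \qquad c_2 = \frac{2L_1^2 - L_0 L_2}{L_0^3}$$

Substituting back into (16), we obtain

$$\beta(t) = \frac{1}{L_0} N(t) + \frac{L_1}{L_0^2} N'(t) + \frac{2L_1^2 - L_0 L_2}{2L_0^3} N''(t) + \cdots$$

►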
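
As a plausibility check on these coefficients, consider a hypothetical special case that is not part of the original example: exponential survival $p(a) = e^{-\mu a}$ with the age integral taken over $0 \le a < \infty$, and exponential growth $N(t) = e^{rt}$. Direct substitution of $\beta(t) = Be^{rt}$ into the integral relation then gives $\beta(t) = (\mu + r)N(t)$, and the series built from $c_0$, $c_1$, $c_2$ should reproduce this. The names `mu`, `r`, and the helper functions are assumptions made for the sketch.

```python
import math

def moment_exponential(k, mu):
    """L_k = integral over a >= 0 of a^k e^{-mu a} da = k! / mu^(k + 1) (assumed survival curve)."""
    return math.factorial(k) / mu ** (k + 1)

def coefficients(l0, l1, l2):
    """c0, c1, c2 obtained by equating coefficients of N, N', N''."""
    c0 = 1 / l0
    c1 = -l1 / l0 ** 2
    c2 = (2 * l1 ** 2 - l0 * l2) / l0 ** 3
    return c0, c1, c2

mu, r = 0.02, 0.01  # hypothetical death and growth rates
c0, c1, c2 = coefficients(*(moment_exponential(k, mu) for k in range(3)))

# With N(t) = e^{rt}: N' = r N and N'' = r^2 N, so births per head are
# c0 - c1 * r + (c2 / 2) * r^2 under the sign convention of the assumed series.
births_per_head = c0 - c1 * r + c2 * r * r / 2
print(births_per_head, mu + r)
```

In this special case $c_0 = \mu$, $c_1 = -1$, and $c_2 = 0$, so the first two terms already give the exact ratio $\mu + r$.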


PROBLEMS 

Write the McLaurin expansion for each of the following functions. Indicate
the radius of convergence of the series obtained.

l.(u + *)‘ 

, „ , , 11/3 5 . (1 + X 2 ) 1 ' 2 6 - 0 x > 

7 : ( iti 2 8.0-x 2 )- *■ (i - x)i/3 

10 (1 

Use series to approximate to four decimal places the values of the following
definite integrals.

T 1 2 1 

11. / dx 

Jo X 

1,/; 


dx 


n.f 

u.f 


- dx 


VS 


1 *4" x 
In dx 


15. $\displaystyle\int (\cos x)^2\, dx$ (Hint: use the identity $(\cos x)^2 = \frac{1}{2}(1 + \cos 2x)$.)

Use the results of Problems 1-10 and Theorem 4 to find the series expansions
of the following functions.

,-in 17- d - ■*> ' 

19. (1 + x 2 )- 1 ' 2 


16. (1 + 

18. (1 + a :)- 3 ' 2 


20. Use

$$\tan^{-1} x = \int_0^x \frac{1}{1 + t^2}\, dt$$

to derive a power series expansion for $\tan^{-1} x$.



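Integrating this series term by term, we obtain

$$\int_0^x \frac{\sin \theta}{\theta}\, d\theta = \left[\theta - \frac{\theta^3}{3 \cdot 3!} + \frac{\theta^5}{5 \cdot 5!} - \frac{\theta^7}{7 \cdot 7!} + \cdots\right]_0^x = x - \frac{x^3}{3 \cdot 3!} + \frac{x^5}{5 \cdot 5!} - \frac{x^7}{7 \cdot 7!} + \cdots$$ ►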
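
The term-by-term integration of $\sin \theta / \theta$ can be compared with a direct numerical integration. The sketch below assumes an upper limit of 1 purely for illustration and uses a midpoint rule, which never evaluates the integrand at $\theta = 0$; the function names are hypothetical.

```python
import math

def integral_series(x, terms):
    """Partial sum of x - x^3/(3*3!) + x^5/(5*5!) - x^7/(7*7!) + ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

def integral_midpoint(x, panels=2000):
    """Midpoint-rule estimate of the integral of sin(theta)/theta from 0 to x."""
    h = x / panels
    return h * sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(panels))

upper = 1.0  # hypothetical upper limit
print(integral_series(upper, 5))
print(integral_midpoint(upper))
```

Five terms of the series already agree with the numerical estimate to about six decimal places, since the terms shrink factorially.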


Example 4 Consider a population which changes only through birth and
death.* Let $\beta(t)$ be the number of births at time $t$, $p(a)$ denote the proportion
which survive to age $a$, and $N(t)$ be the population size at time $t$. Then the
number of persons of age $a$ at time $t$ is $\beta(t - a)p(a)$ and it follows that

$$N(t) = \int_0^{\infty} \beta(t - a)p(a)\, da \qquad (15)$$

We wish to be able to express $\beta(t)$ in terms of $N(t)$. To do this, suppose that

$$\beta(t) = c_0 N(t) - c_1 N'(t) + \frac{c_2}{2!} N''(t) - \cdots + (-1)^k \frac{c_k}{k!} N^{(k)}(t) + \cdots \qquad (16)$$

where the constants $c_0, c_1, c_2, \ldots$ are to be determined.
Expanding $\beta(t - a)$ in a Taylor series about $t$ gives

$$\beta(t - a) = \beta(t) - a\beta'(t) + \frac{a^2}{2!} \beta''(t) - \cdots + \frac{(-a)^k}{k!} \beta^{(k)}(t) + \cdots \qquad (17)$$

If we differentiate the series (16) term by term and insert the results in (17)
we find, after some simplification,

$$\beta(t - a) = c_0 N(t) - (c_1 + ac_0)N'(t) + \left(\frac{c_2}{2} + ac_1 + \frac{a^2 c_0}{2}\right) N''(t) - \cdots$$

This series may be inserted into (15) and the series integrated term by term
to yield

$$N(t) = c_0 L_0 N(t) - (c_1 L_0 + c_0 L_1)N'(t) + \tfrac{1}{2}(c_2 L_0 + 2c_1 L_1 + c_0 L_2)N''(t) - \cdots \qquad (18)$$

where

$$L_k = \int_0^{\infty} a^k p(a)\, da$$

*Adapted from Keyfitz, N., Introduction to the Mathematics of Population (Addison-
Wesley Publishing Company, Reading, Mass., 1968), Chapter 9.



FUNCTIONS OF SEVERAL VARIABLES


17


17.1 PARTIAL DERIVATIVES 

In our discussion of calculus so far, we have only considered functions of a
single variable, that is, functions whose domain is a set of real numbers.
However, functions of $n$ variables, which map $n$-dimensional vectors into
numbers, are frequently encountered in applications.

Example 1 (a) The grade a student receives in a course in which he scores
$S_1$, $S_2$, $S_3$, and $S_4$ on four quizzes is determined by his average score
$A = \frac{1}{4}(S_1 + S_2 + S_3 + S_4)$. Here we have a mapping which transforms
four-dimensional vectors $(S_1, S_2, S_3, S_4)$ into a single real number $A$.
(b) The length of a three-dimensional vector $(x, y, z)$ is (see Section 8.2)

$$L(x, y, z) = \sqrt{x^2 + y^2 + z^2}$$

Here $L$ is a function of three variables. ►

The graph of a function of one variable is two dimensional. (See, for
example, Figures 12 and 13 in Section 12.3.) A three-dimensional graphical
representation is necessary for a function of two variables.


Example 2 (a) The graph of the function 

' 2 + 3> 2 





21. Show that the series $A(x) = \sum_{n=0}^{\infty} a_n x^n$, $A'(x) = \sum_{n=0}^{\infty} na_n x^{n-1}$, and

$$\int A(x)\, dx = \sum_{n=0}^{\infty} \frac{a_n x^{n+1}}{n + 1} + c$$

all have the same radius of convergence. (Hint: Use the ratio test.)

22. Suppose that the function $f$ can be represented by the power series
expansion

$$f(x) = \sum_{k=0}^{\infty} a_k x^k$$

and that this series has radius of convergence $R > 0$. Prove that for
$k = 0, 1, 2, \ldots$ the coefficient $a_k$ is equal to $(1/k!)f^{(k)}(0)$. (Hint:
Successively differentiate the series term by term, setting $x = 0$ after
each differentiation.) This proves that McLaurin series representations
are unique or, what is the same thing, that coefficients of the same powers
of $x$ in two different power series representations of a function must
be equal.

23. In Example 4, let $t = \theta - L_1/L_0$ and expand $\beta(t)$ around the point $\theta$.

SUPPLEMENTARY READING 

Hirschman, I. I., Infinite Series (Holt, Rinehart and Winston, Inc., New York, 1962).
Rainville, E. D., Infinite Series (Macmillan Company, New York, 1967).





FIGURE 2 



Such a function associates one real number with a collection of n numbers.
Each of these n numbers may be altered either in conjunction with, or
independently of, all the others. Indeed, a classical experimental method studies
the change in the value of $f$ which occurs when one of the n variables is
changed while the others are held fixed.

Example 3 That part of the graph of the function $f(x, t) = 4 - x^2 - t^2$
which falls in the first octant (that is, the region where $x$, $t$, and $f$ are all
positive) is shown in Figure 3. The value $f(x, t)$ depends on both quantities
$x$ and $t$. However, if we fix the value of $t$, say at $t = 1$, then

$$f(x, t) = 4 - x^2 - t^2$$

becomes $g(x) = f(x, 1) = 3 - x^2$, a function of $x$ alone. It then makes
sense to speak of the derivative of $g$, $Dg(x) = Df(x, 1) = -2x$, the second
derivative $D^2g(x) = -2$, etc., and these derivatives have the same
interpretations as any other derivatives. For instance, referring to Figure 3, the
slope of the curve

$$t = 1 \quad\text{and}\quad g(x) = f(x, 1) = 3 - x^2$$

at the point $(x, t, f(x, t)) = (1, 1, 2)$ is

$$Dg(1) = -2(1) = -2$$

Note that this may also be interpreted as the slope in the $x$ direction at the
point $(1, 1, 2)$ on the surface $f(x, t) = 4 - x^2 - t^2$. For this reason,
this derivative is sometimes referred to as a directional derivative. ►





is shown three-dimensionally in Figure 1. The graph is a bullet-shaped surface
whose height above a point $p = (x, y)$ in the bottom plane is the value
of the function at that point. As an aid to visualizing the graph, note that
at points $(x, y)$ where $f$ takes the constant value $c^2 > 0$ we must have

$$x^2 + y^2 = c^2 \qquad (1)$$

FIGURE 1

The points which satisfy (1) form a circle with radius $c$ parallel to the bottom
plane. It follows that this circle represents the cross section of the surface
which lies at a height $c^2$ above the $x,y$ plane. (See Figure 1.)

(b) The graph of $f(x, y) = \tfrac{1}{3}\sqrt{36 - 9x^2 - 4y^2}$ is shown in Figure 2.
The domain of this function is the set of all points $(x, y)$ for which
$9x^2 + 4y^2 \le 36$. Its range is the interval $[0, 2]$. The shape is that of an
inverted bowl. The height above $(x, y)$ represents the functional value.


A function $f$ whose domain is a set of n-dimensional vectors and whose
range is a set of real numbers is said to be a real-valued function of n variables.


470 / FUNCTIONS OF SEVERAL VARIABLES 



Note that the same result would be obtained from the original expression for
$f$ by considering $x_1$, $x_3$, and $x_4$ as constants and applying the differentiation
formula $Dx^n = nx^{n-1}$. ►


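Example 5 (a) The partial derivative with respect to $y$ of $f(x, y) = -3xy + \ln(x^2 + y^2)$ is

$$\frac{\partial f(x, y)}{\partial y} = -3x + \frac{2y}{x^2 + y^2}$$

(b) The partial derivative of $g(x, y, z, w) = (x^2 + y^2 + z^2 + w^2)^{1/2}$
with respect to $w$ is

$$\frac{\partial g(x, y, z, w)}{\partial w} = \tfrac{1}{2}(x^2 + y^2 + z^2 + w^2)^{-1/2}\,2w$$

(c) Let $x = (r, s, t, u)$ and suppose that

$$f(r, s, t, u) = re^s - \sin\frac{r}{t} + s^3u^2$$

Then

$$\frac{\partial f(x)}{\partial r} = e^s - \frac{1}{t}\cos\frac{r}{t}, \qquad \frac{\partial f(x)}{\partial t} = \frac{r}{t^2}\cos\frac{r}{t}$$

$$\frac{\partial f(x)}{\partial s} = re^s + 3s^2u^2, \qquad \frac{\partial f(x)}{\partial u} = 2s^3u$$

►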


Example 6 The equations

$$I(t) = a_1F(t) + a_2A(t)$$

$$F'(t) = b[I(t) - \beta F(t)]$$

were introduced in Example 5 of Section 12.1 as representing the relations in
group behavior among interaction $I$, friendliness $F$, and activity $A$. The
assumption that an increase in either $A$ or $F$ will bring about an increase in
$I$ is expressed mathematically as

$$a_1 = \frac{\partial I}{\partial F} > 0 \quad\text{and}\quad a_2 = \frac{\partial I}{\partial A} > 0$$

The assumption that the rate of change of friendliness increases with an
increase in $I$ may be written

$$b = \frac{\partial F'}{\partial I} > 0$$

►


Since partial derivatives may be computed just as ordinary derivatives, it
follows that all theorems concerning derivatives apply equally well to them.
In particular, there is a mean value theorem for partial derivatives. For





FIGURE 3 





by varying $x_i$ ($1 \le i \le n$) while holding the variables $x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed is called the partial derivative of $f$ with respect to $x_i$. In
symbols, if we let $e_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$ be the vector with 1 in the
$i$th position and zeros elsewhere, then the partial derivative of $f$ with respect
to $x_i$ at the point $x = (x_1, x_2, \ldots, x_n)$ is defined to be the limit

$$\lim_{h \to 0} \frac{f(x + he_i) - f(x)}{h} \qquad (2)$$

which is denoted by

$$\frac{\partial f(x)}{\partial x_i}$$

Obviously, by considering all the $x_j$ except $x_i$ to be held fixed (that is, to be
constant), the partial derivative may be computed using the same formulas
as for ordinary derivatives.
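The limit in (2) can be approximated numerically. The sketch below is only an illustration: it uses a central difference (a common numerical stand-in for the one-sided quotient in the definition), and the helper `partial`, the test function `f`, and the step size `h` are all assumptions introduced here, not part of the text.

```python
def partial(f, x, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x[i].

    f takes a list of numbers; all coordinates except x[i] are held fixed.
    """
    forward = list(x)
    backward = list(x)
    forward[i] += h    # the point x + h*e_i
    backward[i] -= h   # the point x - h*e_i
    return (f(forward) - f(backward)) / (2 * h)


# Hypothetical test function f(x1, x2) = x1**2 * x2,
# whose exact partials are 2*x1*x2 and x1**2.
def f(x):
    return x[0] ** 2 * x[1]


point = [3.0, 2.0]
d1 = partial(f, point, 0)  # exact value is 2*3*2 = 12
d2 = partial(f, point, 1)  # exact value is 3**2 = 9
print(d1, d2)
```

Only the coordinate `x[i]` is perturbed, which mirrors the instruction to treat every other variable as a constant.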


Example 4 Let /(x) = /(x,, x 2 , x 3 , x 4 ) = x\ + 2x^x 3 + A 

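Then

$$\begin{aligned}
\frac{\partial f(x)}{\partial x_2} &= \lim_{h \to 0} \frac{f(x + he_2) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{f(x_1, x_2 + h, x_3, x_4) - f(x_1, x_2, x_3, x_4)}{h} \\
&= \lim_{h \to 0} \frac{2(x_2 + h)^2x_3 - 2x_2^2x_3}{h} \\
&= \lim_{h \to 0} \frac{4x_2x_3h + 2h^2x_3}{h} \\
&= 4x_2x_3
\end{aligned}$$
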





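of $f$, first with respect to $r$ and then with respect to $s$, is

$$\frac{\partial^2 f(x)}{\partial s\,\partial r} = \frac{\partial}{\partial s}\left[\frac{\partial f(x)}{\partial r}\right] = \frac{\partial}{\partial s}\left[e^s - \frac{1}{t}\cos\frac{r}{t}\right] = e^s$$

The second partial, first with respect to $s$ and then with respect to $r$, is

$$\frac{\partial^2 f(x)}{\partial r\,\partial s} = \frac{\partial}{\partial r}\left[\frac{\partial f(x)}{\partial s}\right] = \frac{\partial}{\partial r}\left[re^s + 3s^2u^2\right] = e^s$$

Note that $\partial^2 f(x)/\partial s\,\partial r$ equals $\partial^2 f(x)/\partial r\,\partial s$. Also note that the order in
which the symbols $r$ and $s$ appear indicates the sequence in which the partial
derivatives are to be computed.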
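As a rough numerical check of the mixed partial of $f(r, s, t, u) = re^s - \sin(r/t) + s^3u^2$, the sketch below compares a four-point finite-difference estimate with the value $e^s$. The evaluation point, the step size, and the helper name `mixed_rs` are assumptions chosen for illustration only.

```python
import math


def f(r, s, t, u):
    """The function of Example 7: r*e^s - sin(r/t) + s^3 * u^2."""
    return r * math.exp(s) - math.sin(r / t) + s ** 3 * u ** 2


def mixed_rs(r, s, t, u, h=1e-4):
    """Four-point central-difference estimate of the mixed second partial in r and s."""
    return (f(r + h, s + h, t, u) - f(r + h, s - h, t, u)
            - f(r - h, s + h, t, u) + f(r - h, s - h, t, u)) / (4 * h * h)


# Arbitrary illustrative point (t must be nonzero).
r, s, t, u = 1.0, 0.5, 2.0, 3.0
estimate = mixed_rs(r, s, t, u)
exact = math.exp(s)  # the analytic mixed partial e^s
print(estimate, exact)
```

The symmetric stencil does not distinguish the order of differentiation, so it checks the common value $e^s$ rather than the equality of the two orders.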


The fact that $\partial^2 f(x)/\partial s\,\partial r = \partial^2 f(x)/\partial r\,\partial s$ in Example 7 is not accidental.
The following theorem, stated without proof, indicates that in all ordinary
cases the order of computing partial derivatives is irrelevant.

Theorem 2 If the second mixed partial derivatives $\partial^2 f(x, y)/\partial x\,\partial y$ and
$\partial^2 f(x, y)/\partial y\,\partial x$ are continuous, then they are equal. ►


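Example 8 Suppose* the utility function for consumption of two products
is given by

$$u = kc_1^{\alpha}c_2^{\beta}$$

where $c_1$ and $c_2$ denote respective quantities consumed, and $\alpha$, $\beta$, and $k$
are constants restricted by $0 < \alpha < 1$, $0 < \beta < 1$, and $k > 0$. The partial
derivative of $u$ with respect to $c_1$ is

$$\frac{\partial u}{\partial c_1} = k\alpha c_1^{\alpha-1}c_2^{\beta} = \frac{\alpha}{c_1}kc_1^{\alpha}c_2^{\beta} = \frac{\alpha u}{c_1}$$

Similarly, we compute

$$\frac{\partial u}{\partial c_2} = \frac{\beta u}{c_2}, \qquad \frac{\partial^2 u}{\partial c_1^2} = \frac{\alpha(\alpha - 1)u}{c_1^2}, \qquad \frac{\partial^2 u}{\partial c_2^2} = \frac{\beta(\beta - 1)u}{c_2^2}$$

and

$$\frac{\partial^2 u}{\partial c_1\,\partial c_2} = \frac{\partial^2 u}{\partial c_2\,\partial c_1} = \frac{\alpha\beta u}{c_1c_2}$$

Analysis of the partial derivatives shows that if we make the natural
assumption that $c_1$ and $c_2$ are positive, then $\partial u/\partial c_1$, $\partial u/\partial c_2$, and $\partial^2 u/\partial c_1\,\partial c_2$
are positive, while $\partial^2 u/\partial c_1^2$ and $\partial^2 u/\partial c_2^2$ are negative. Hence the marginal
utility of each product is positive. The marginal utility of each product
diminishes with increased consumption of that product, but increases with
increased consumption of the other product. ►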
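The sign pattern described in Example 8 can be spot-checked for one set of parameter values. In the sketch below the values $k = 1$, $\alpha = 0.5$, $\beta = 0.25$ and the point $(c_1, c_2) = (2, 1)$ are assumptions chosen only to satisfy the stated restrictions; the function names are likewise illustrative.

```python
def utility(c1, c2, k=1.0, alpha=0.5, beta=0.25):
    """u = k * c1**alpha * c2**beta with 0 < alpha, beta < 1 and k > 0."""
    return k * c1 ** alpha * c2 ** beta


def partials(c1, c2, k=1.0, alpha=0.5, beta=0.25):
    """First and second partials written in terms of u, as in Example 8."""
    u = utility(c1, c2, k, alpha, beta)
    return {
        "u1": alpha * u / c1,                      # marginal utility of product 1
        "u2": beta * u / c2,                       # marginal utility of product 2
        "u11": alpha * (alpha - 1) * u / c1 ** 2,  # diminishing in c1
        "u22": beta * (beta - 1) * u / c2 ** 2,    # diminishing in c2
        "u12": alpha * beta * u / (c1 * c2),       # cross effect
    }


values = partials(2.0, 1.0)

# Independent finite-difference check of the first partial in c1.
h = 1e-6
numeric_u1 = (utility(2.0 + h, 1.0) - utility(2.0 - h, 1.0)) / (2 * h)
print(values, numeric_u1)
```

A single point cannot prove the general statement, but it confirms that the closed-form partials agree with a direct numerical derivative and carry the expected signs.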

*Adapted from Brems, H., Quantitative Economic Theory: A Synthetic Approach (John
Wiley & Sons, Inc., New York, 1968), p. 18.





simplicity, we state the theorem for functions of two variables. The
generalization to any number of variables should be apparent.

Theorem 1 Mean Value Theorem for Partial Derivatives Let $f(x, y)$ be
a function of two variables having partial derivatives

$$\frac{\partial f(x, y)}{\partial x} \quad\text{and}\quad \frac{\partial f(x, y)}{\partial y}$$

which exist throughout its domain. Then if the line joining points $(a, y)$ and
$(b, y)$ lies entirely in the domain of $f$, there exists a number $c$ between $a$
and $b$ at which

$$f(b, y) - f(a, y) = (b - a)\frac{\partial f(c, y)}{\partial x}$$

Similarly, if the line joining $(x, a)$ and $(x, b)$ lies in the domain of $f$, there
exists a number $d$ between $a$ and $b$ for which

$$f(x, b) - f(x, a) = (b - a)\frac{\partial f(x, d)}{\partial y}$$

Note that in the first instance $y$ is held fixed, while in the second, $x$ is fixed. ►

Partial derivatives of higher order may be computed by applying standard
differentiation formulas to partial derivatives already obtained.

Example 7 For the function $f(x) = f(r, s, t, u) = re^s - \sin(r/t) + s^3u^2$
appearing in Example 5(c) we found

$$\frac{\partial f(x)}{\partial r} = e^s - \frac{1}{t}\cos\frac{r}{t}$$

as the partial derivative of $f$ with respect to $r$. The second partial of $f$ with
respect to $r$, denoted by $\partial^2 f(x)/\partial r^2$, is

$$\frac{\partial^2 f(x)}{\partial r^2} = \frac{\partial}{\partial r}\left[\frac{\partial f(x)}{\partial r}\right] = \frac{1}{t^2}\sin\frac{r}{t}$$

The third partial with respect to $r$ is

$$\frac{\partial^3 f(x)}{\partial r^3} = \frac{1}{t^3}\cos\frac{r}{t}$$

The first partial derivative with respect to $u$ is $\partial f(x)/\partial u = 2s^3u$, the
second is $\partial^2 f(x)/\partial u^2 = 2s^3$, while the third and all higher partials are zero.
Partial derivatives may also be mixed. Thus the second partial derivative





where $Y$ is national income, $C$ is consumption, $I$ is investment, $G$ is
government expenditure, $T$ is tax collection, and $\alpha$, $\beta$, $\gamma$, and $\delta$ are positive
constants with $\beta$ and $\delta$ both less than unity.

(a) Solve for $Y$ in terms of the constants and $I$ and $G$.

(b) Show that $\partial Y/\partial G > 0$ and $\partial Y/\partial\gamma < 0$, while $\partial Y/\partial\delta$ has the same
sign as $\beta\gamma - I - G - \alpha$.


19. Define the function $f$ by

$$f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2}$$

when $(x, y) \ne (0, 0)$ and $f(x, y) = 0$ when $x$ and $y$ are both zero.
Show that the second partial derivatives

$$\frac{\partial^2 f(0, 0)}{\partial x\,\partial y} \quad\text{and}\quad \frac{\partial^2 f(0, 0)}{\partial y\,\partial x}$$

of $f$ at $(0, 0)$ exist but are not equal. (This gives an example of a function
for which Theorem 2 does not hold. Of course, these second partials
are not continuous at $(0, 0)$.)


17.2 CHAIN RULE FOR FUNCTIONS OF 
SEVERAL VARIABLES 


The Chain Rule for functions of several variables is a generalization of the
formula

$$D[g \circ f(x)] = g'(f(x))f'(x)$$

for differentiating a composite function $h(x) = g \circ f(x)$ of a single variable
(Section 12.4). To take the simplest multivariate case, suppose that $H$ is a
function of two variables and has continuous first-order partial derivatives,
while functions $f$ and $g$ are both differentiable functions of a single variable.
Then the function $G$ defined by

$$G(t) = H(f(t), g(t)) \qquad (3)$$

is a function of a single variable. Specifically, $G$ changes a real number $t$
into the number $G(t)$ by first computing $f(t)$ and $g(t)$ and then combining
these results, by means of the function $H$, to obtain $H(f(t), g(t))$. (See
Figure 4.)


FIGURE 4 


172 CHAIN RULE / 477 



PROBLEMS 

In Problems 1-8, find all first partial derivatives of the indicated functions 
1. S{x, y ) = x 2 y - y 2 x 2. Six, y) = xV + J' 3 * 

3. zix, }) = ix 2 + y 2 ) v2 4 - R ( s ' r ) = e_ * 1 


5. /(r, 

7. gix, y) = ix 2 — 6y)(x + 2) 


6 . 

8 . 


u = x sin y + oy sin x 



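9. If $f(x, y) = e^x\cos y$, find

$$\frac{\partial f}{\partial x}(0, \pi/3) \quad\text{and}\quad \frac{\partial f}{\partial y}(0, \pi/3)$$

10. Find the slopes at the point $(x, y, z) = (2, -3, 14)$ of the curves cut
from the surface $z = 2x + y^2 + 1$ by the respective planes $x = 2$ and
$y = -3$.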

In Problems 11-14 find all second partial derivatives of the given functions 

11. u(x, y) = y sin x 12. fir, i) = a sm -1 (r/l) 

13. g(x, y, z ) = e*»* 14. /»(«, u, iv) = e u cos o + <?’ sin >v 

15. Rapoport* considered a two person game characterized by the equations 


51 = In (1 + px + qy ) — px 

5 2 = In (1 + qx + py) — Py 

where x and y are positive quantities denoting the respective efforts of 
the players, p, q, and p are positive constants with p + q = 1 and 
P < P, and Si and S 2 are the respective satisfaction functions of the 
players Show that dSi/dx = dSz/dy = 0 when p q and 



16. If/(x, >) = cos (x + y) + cos ix — y), show that 


3*2 


3V 


17. Show that if $f(x, t) = e^{-\beta^2t}\sin\beta x$, then $\partial^2 f/\partial x^2 = \partial f/\partial t$ regardless of
the value of $\beta$.


18. Given the economic model

$$Y = C + I + G$$

$$C = \alpha + \beta(Y - T)$$

$$T = \gamma + \delta Y$$

*Rapoport, A., "Some Game-Theoretical Aspects of Parasitism and Symbiosis," Bulletin
of Mathematical Biophysics 18, 15-30 (1956).





The computation in Example 2 may easily be verified from known
differentiation formulas since $DG(t) = D[t^2 + \sin^3 t] = 2t + 3\sin^2 t\cos t$.
Obviously, the Chain Rule is not useful for so simple a problem, nor was it
developed for this purpose. Instead, what we have in mind is the following
more general problem involving partial derivatives.
Suppose that $H$, $f$, and $g$ are all functions of two variables. Then the
composite function $F$ defined by

$$F(r, s) = H(f(r, s), g(r, s))$$


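is also a function of two variables $r$ and $s$. The partial derivatives of $F$ may
be found by applying the Chain Rule (5) to each variable separately. To
simplify notation, let us agree to write $x$ for $f(r, s)$ and $y$ for $g(r, s)$ and to
indicate the partial derivatives

$$\frac{\partial H}{\partial x}(x, y), \quad \frac{\partial H}{\partial y}(x, y), \quad \frac{\partial f}{\partial r}(r, s), \quad \frac{\partial f}{\partial s}(r, s), \quad \frac{\partial g}{\partial r}(r, s), \quad\text{and}\quad \frac{\partial g}{\partial s}(r, s)$$

by

$$\frac{\partial H}{\partial x}, \quad \frac{\partial H}{\partial y}, \quad \frac{\partial x}{\partial r}, \quad \frac{\partial x}{\partial s}, \quad \frac{\partial y}{\partial r}, \quad\text{and}\quad \frac{\partial y}{\partial s}$$

respectively. Then, holding $s$ fixed so that $H$ is a function of $r$ alone, and
applying (5), we have

$$\frac{\partial F(r, s)}{\partial r} = \frac{\partial H}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial H}{\partial y}\frac{\partial y}{\partial r} \qquad (6)$$

Similarly, if $r$ is held fixed, we find

$$\frac{\partial F(r, s)}{\partial s} = \frac{\partial H}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial H}{\partial y}\frac{\partial y}{\partial s} \qquad (7)$$

Equations (6) and (7) constitute the Chain Rule for functions of two variables.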


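Example 3 Let $H(x, y) = e^{xy}$, $x = f(r, \theta) = r\cos\theta$, $y = g(r, \theta) = r\sin\theta$,
and define $F$ by $F(r, \theta) = H(f(r, \theta), g(r, \theta))$. Then

$$\begin{aligned}
\frac{\partial F(r, \theta)}{\partial r} &= \frac{\partial H}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial H}{\partial y}\frac{\partial y}{\partial r} \\
&= ye^{xy}\cos\theta + xe^{xy}\sin\theta \\
&= r\sin\theta\,e^{r^2\sin\theta\cos\theta}\cos\theta + r\cos\theta\,e^{r^2\sin\theta\cos\theta}\sin\theta \\
&= 2r\sin\theta\cos\theta\,e^{r^2\sin\theta\cos\theta}
\end{aligned}$$

Similarly,

$$\begin{aligned}
\frac{\partial F(r, \theta)}{\partial\theta} &= \frac{\partial H}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial H}{\partial y}\frac{\partial y}{\partial\theta} \\
&= ye^{xy}(-r\sin\theta) + xe^{xy}(r\cos\theta) \\
&= r^2(\cos^2\theta - \sin^2\theta)\,e^{r^2\sin\theta\cos\theta}
\end{aligned}$$

►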
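The two Chain Rule results for $F(r, \theta) = e^{xy}$ with $x = r\cos\theta$, $y = r\sin\theta$ can be compared against direct numerical differentiation of the composite function. The sketch below is a sanity check only; the evaluation point and step size are arbitrary assumptions, and the function names are introduced here for illustration.

```python
import math


def F(r, theta):
    """Composite function H(x, y) = exp(x*y) with x = r*cos(theta), y = r*sin(theta)."""
    x = r * math.cos(theta)
    y = r * math.sin(theta)
    return math.exp(x * y)


def dF_dr(r, theta):
    """Chain Rule result (6): 2 r sin(theta) cos(theta) exp(r^2 sin(theta) cos(theta))."""
    sc = math.sin(theta) * math.cos(theta)
    return 2 * r * sc * math.exp(r * r * sc)


def dF_dtheta(r, theta):
    """Chain Rule result (7): r^2 (cos^2(theta) - sin^2(theta)) exp(r^2 sin(theta) cos(theta))."""
    sc = math.sin(theta) * math.cos(theta)
    return r * r * (math.cos(theta) ** 2 - math.sin(theta) ** 2) * math.exp(r * r * sc)


r0, th0, h = 1.2, 0.7, 1e-6
num_r = (F(r0 + h, th0) - F(r0 - h, th0)) / (2 * h)     # numerical dF/dr
num_th = (F(r0, th0 + h) - F(r0, th0 - h)) / (2 * h)    # numerical dF/dtheta
print(num_r, dF_dr(r0, th0))
print(num_th, dF_dtheta(r0, th0))
```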





Example 1 Suppose $H(x, y) = x^2 + y^3$, $f(t) = t$, and $g(t) = \sin t$. Then
$G(t) = H(f(t), g(t)) = H(t, \sin t) = t^2 + \sin^3 t$. ►


Being an ordinary function of a single variable, the function $G$ in (3) has a
derivative defined in the usual way by

$$\begin{aligned}
G'(t) &= \lim_{h \to 0}\frac{G(t + h) - G(t)}{h} \\
&= \lim_{h \to 0}\frac{H[f(t + h), g(t + h)] - H[f(t), g(t)]}{h}
\end{aligned} \qquad (4)$$

In order to develop the Chain Rule, we first add and subtract the quantity
$H[f(t + h), g(t)]$ in (4) to obtain

$$G'(t) = \lim_{h \to 0}\frac{H[f(t + h), g(t + h)] - H[f(t + h), g(t)]}{h} + \lim_{h \to 0}\frac{H[f(t + h), g(t)] - H[f(t), g(t)]}{h}$$

Now applying the Mean Value Theorem for Partial Derivatives (Theorem
1) gives

$$G'(t) = \lim_{h \to 0}\frac{\partial H[f(t + h), c]}{\partial y}\left[\frac{g(t + h) - g(t)}{h}\right] + \lim_{h \to 0}\frac{\partial H[d, g(t)]}{\partial x}\left[\frac{f(t + h) - f(t)}{h}\right]$$

where $c$ is a number between $g(t)$ and $g(t + h)$ and $d$ lies between $f(t)$ and
$f(t + h)$.

The assumed continuity of $f$ and $g$ means that $c \to g(t)$ and $d \to f(t)$
as $h \to 0$. Taking limits we obtain the following important result (in its
simplest version).


Chain Rule If the function $G$ is defined by (3), then

$$G'(t) = \frac{\partial H(f(t), g(t))}{\partial x}f'(t) + \frac{\partial H(f(t), g(t))}{\partial y}g'(t) \qquad (5)$$

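Example 2 In Example 1, we have $H(x, y) = x^2 + y^3$, $f(t) = t$, and
$g(t) = \sin t$. Thus

$$\frac{\partial H}{\partial x}(x, y) = 2x, \quad \frac{\partial H}{\partial y}(x, y) = 3y^2, \quad f'(t) = 1, \quad g'(t) = \cos t$$

Applying (5) gives

$$\begin{aligned}
G'(t) &= \frac{\partial H(f(t), g(t))}{\partial x}f'(t) + \frac{\partial H(f(t), g(t))}{\partial y}g'(t) \\
&= 2f(t)f'(t) + 3g^2(t)g'(t) \\
&= 2t + 3\sin^2 t\cos t
\end{aligned}$$

►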




PROBLEMS 


In Problems 1-4, apply Equation (5) to find $u'(t)$. (The symbols $u$, $x$, $y$, and
$z$ are shorthand for functions $u(t)$, $x(t)$, $y(t)$, and $z(t)$ of $t$.) Check by
eliminating $x$, $y$, and $z$ before differentiating.

1. u = .v 2 + y- % x = t(/) = (/ + 1 )//, y « y(t) = t/(t - 1) 

2. u — x 1/a + 2 xy 2 — 2) 1/2 , x = 1 + / 2 , y — 1 — t 2 

3. u = t~ sin \y,\ — / 1/2 ,> = In / 

4. ;/ — x cos ; -f- ) sin r, \ = /, y = e*, z = 1/2 / 

In Problems 5-10, $w$ is a function of $x$ and $y$ (or of $x$, $y$, and $z$) which are,
in turn, functions of $r$ and $t$. Use Equations (6) and (7) to find $\partial w/\partial r$ and
$\partial w/\partial t$. Check by eliminating $x$, $y$, and $z$ first.

5. u = x~ — y 2 , x = r(r, /) — r sin /, y — y (r, t) = r cos t 

6. n> = e'~ r,u , x~2r — t,y = r J r2t 


7. w = tan -1 ( y/x ), x = r + a sin /, y = r — a cos / 
8 w — xe 2u — ye x , x — rt % y — t/r 


9. w — y/ x 2 ~h y 2 + z 2 , a = rt, y = r sin /, z ~ t cos r 


10 . «/ = 


y2 + * 


, x = e , y ~ re 


1 

t 


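11. Suppose $u = u(x, y)$, while $x = r\cos\theta$ and $y = r\sin\theta$.

(a) Find $\partial u/\partial r$ and $\partial u/\partial\theta$.

(b) Show that

$$\frac{\partial u}{\partial x} = \cos\theta\,\frac{\partial u}{\partial r} - \frac{\sin\theta}{r}\frac{\partial u}{\partial\theta}$$

$$\frac{\partial u}{\partial y} = \sin\theta\,\frac{\partial u}{\partial r} + \frac{\cos\theta}{r}\frac{\partial u}{\partial\theta}$$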


12. Suppose $w = f(u/v)$. Prove that

$$u\frac{\partial w}{\partial u} + v\frac{\partial w}{\partial v} = 0$$


In Problems 13-17, assume that the given relations determine $y$ as a function
of $x$. Find $dy/dx$ in each case.

13. x* ~ y 2 + 2axy = 0 14. xy — Sin y = x 2 

15. c sin i + e'smx* l I<>. sin — — cos - =* ^ 

y x 2 

17. In y -f- ye* = 10 




Example 4 The real advantage of the Chain Rule for two-variable functions
comes when it is impossible or impractical to solve for some of the
variables in terms of the others. For instance, suppose we consider the
equation

$$H(x, y) = 0$$

as implicitly specifying $y$ as a function of $x$. Applying (6) we compute

$$0 = DH(x, y) = \frac{\partial H}{\partial x} + \frac{\partial H}{\partial y}y'(x)$$

from which it follows that

$$y'(x) = -\frac{\partial H/\partial x}{\partial H/\partial y}$$

To take a specific illustration, suppose

$$x^2y + x\sin y = 1$$

Then $H(x, y) = x^2y + x\sin y - 1 = 0$, $\partial H/\partial x = 2xy + \sin y$,

$$\frac{\partial H}{\partial y} = x^2 + x\cos y$$

and

$$y'(x) = -\frac{\partial H/\partial x}{\partial H/\partial y} = -\frac{2xy + \sin y}{x^2 + x\cos y}$$

►
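The implicit-derivative formula for $x^2y + x\sin y = 1$ can be checked numerically by solving the equation for $y$ at nearby values of $x$. The sketch below is an illustration under stated assumptions: the bisection bracket $[0, 2]$, the point $x_0 = 1$, and the step size are choices made here (near $x = 1$ the function $H(x, \cdot)$ changes sign once on that bracket), and the helper names are hypothetical.

```python
import math


def H(x, y):
    """H(x, y) = x^2 y + x sin y - 1; the curve is H(x, y) = 0."""
    return x * x * y + x * math.sin(y) - 1.0


def solve_y(x, lo=0.0, hi=2.0):
    """Bisection for the root of H(x, .) on [lo, hi]; assumes one sign change there."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if H(x, lo) * H(x, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def implicit_slope(x, y):
    """y'(x) = -(dH/dx)/(dH/dy) = -(2xy + sin y)/(x^2 + x cos y)."""
    return -(2 * x * y + math.sin(y)) / (x * x + x * math.cos(y))


x0, h = 1.0, 1e-5
y0 = solve_y(x0)
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)  # slope from solved points
formula = implicit_slope(x0, y0)                         # slope from the formula
print(y0, numeric, formula)
```

The point of the comparison is that the formula needs only the single solved value $y_0$, whereas the numerical slope requires solving the equation twice more.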


Example 5 Consider a single-commodity market in which demand $D$ is
assumed to be a function $f(P, Y)$ of both price $P$ and income $Y$, while
supply $S$ is a function $g(P)$ of price alone. The equilibrium position of the
market is determined by the condition

$$g(P) = f(P, Y) \qquad (8)$$

which determines price as a function $P^*$ of income. Differentiating the
relation (8) gives

$$g'(P^*)\frac{\partial P^*}{\partial Y} = \frac{\partial f}{\partial P}\frac{\partial P^*}{\partial Y} + \frac{\partial f}{\partial Y}$$

from which the derivative of $P^*$ is obtained as

$$\frac{\partial P^*}{\partial Y} = \frac{\partial f/\partial Y}{g'(P^*) - \partial f/\partial P}$$

Let us assume that demand increases as income rises ($\partial f/\partial Y > 0$) and
decreases with a rise in price ($\partial f/\partial P < 0$) and that supply increases with price
($dg/dP > 0$). Then $\partial P^*/\partial Y$ is positive and it follows that an increase
(decrease) in income will result in a corresponding increase (decrease) in
equilibrium price. ►





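Writing $x = f(t) = tb_1$ and $y = g(t) = tb_2$ in the Chain Rule (5) gives

$$G'(t) = \frac{\partial H(tb)}{\partial x}f'(t) + \frac{\partial H(tb)}{\partial y}g'(t) = b_1\frac{\partial H(tb)}{\partial x} + b_2\frac{\partial H(tb)}{\partial y} \qquad (9)$$

for the first derivative of $G$.

The Chain Rule may now be applied to obtain the second derivative of $G$.
Each term in (9), being a function of both $x$ and $y$, will produce two new
terms. We obtain

$$\begin{aligned}
G''(t) &= DG'(t) = b_1D\frac{\partial H(tb)}{\partial x} + b_2D\frac{\partial H(tb)}{\partial y} \\
&= b_1\left[\frac{\partial^2 H(tb)}{\partial x^2}f'(t) + \frac{\partial^2 H(tb)}{\partial y\,\partial x}g'(t)\right] + b_2\left[\frac{\partial^2 H(tb)}{\partial x\,\partial y}f'(t) + \frac{\partial^2 H(tb)}{\partial y^2}g'(t)\right] \\
&= b_1^2\frac{\partial^2 H(tb)}{\partial x^2} + b_2b_1\frac{\partial^2 H(tb)}{\partial y\,\partial x} + b_1b_2\frac{\partial^2 H(tb)}{\partial x\,\partial y} + b_2^2\frac{\partial^2 H(tb)}{\partial y^2} \\
&= b_1^2\frac{\partial^2 H(tb)}{\partial x^2} + 2b_1b_2\frac{\partial^2 H(tb)}{\partial x\,\partial y} + b_2^2\frac{\partial^2 H(tb)}{\partial y^2}
\end{aligned}$$

The pattern for higher derivatives should now be apparent. Differentiating a
third time gives $G'''(t)$ on the left side, while on the right side each term
generates two new terms corresponding, respectively, to the partial derivatives
with respect to $x$ and to $y$. Each differentiation with respect to $x$ produces
another factor $b_1 = f'(t)$, while $b_2 = g'(t)$ appears with each differentiation
with respect to $y$. Terms which differ only in the order of differentiation are
equal and can be combined. Thus,

$$G'''(t) = b_1^3\frac{\partial^3 H(tb)}{\partial x^3} + 3b_1^2b_2\frac{\partial^3 H(tb)}{\partial x^2\,\partial y} + 3b_1b_2^2\frac{\partial^3 H(tb)}{\partial x\,\partial y^2} + b_2^3\frac{\partial^3 H(tb)}{\partial y^3}$$

$$G^{(4)}(t) = b_1^4\frac{\partial^4 H(tb)}{\partial x^4} + 4b_1^3b_2\frac{\partial^4 H(tb)}{\partial x^3\,\partial y} + 6b_1^2b_2^2\frac{\partial^4 H(tb)}{\partial x^2\,\partial y^2} + 4b_1b_2^3\frac{\partial^4 H(tb)}{\partial x\,\partial y^3} + b_2^4\frac{\partial^4 H(tb)}{\partial y^4}$$

and, in general,

$$G^{(n)}(t) = \sum_{i=0}^{n}\binom{n}{i}b_1^ib_2^{n-i}\frac{\partial^n H(tb)}{\partial x^i\,\partial y^{n-i}}$$

may be proved by induction.

We now substitute these results and put $t = 1$ in the Taylor Formula

$$G(t) = G(0) + G'(0)t + \frac{G''(0)}{2!}t^2 + \cdots + \frac{G^{(n-1)}(0)}{(n-1)!}t^{n-1} + \frac{G^{(n)}(c)}{n!}t^n$$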

173 TAYLOR'S FORMULA-OPTIMIZATION / 483 



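Example 2 For the utility function $u = kc_1^{\alpha}c_2^{\beta}$ in Example 8 of Section
17.1, the first and second partials were found to be

$$\frac{\partial u}{\partial c_1} = \frac{\alpha u}{c_1}, \qquad \frac{\partial u}{\partial c_2} = \frac{\beta u}{c_2}, \qquad \frac{\partial^2 u}{\partial c_1^2} = \frac{\alpha(\alpha - 1)u}{c_1^2}$$

$$\frac{\partial^2 u}{\partial c_1\,\partial c_2} = \frac{\alpha\beta u}{c_1c_2}, \qquad \frac{\partial^2 u}{\partial c_2^2} = \frac{\beta(\beta - 1)u}{c_2^2}$$

Hence the Taylor polynomial of second degree around the point $a = (2, 1)$ is

$$\begin{aligned}
u(c_1, c_2) \approx{} & u(2, 1) + \frac{\partial u(2, 1)}{\partial c_1}(c_1 - 2) + \frac{\partial u(2, 1)}{\partial c_2}(c_2 - 1) \\
& + \frac{1}{2!}\left[\frac{\partial^2 u(2, 1)}{\partial c_1^2}(c_1 - 2)^2 + 2\frac{\partial^2 u(2, 1)}{\partial c_1\,\partial c_2}(c_1 - 2)(c_2 - 1) + \frac{\partial^2 u(2, 1)}{\partial c_2^2}(c_2 - 1)^2\right] \\
={} & k2^{\alpha}\left[1 + \frac{\alpha}{2}(c_1 - 2) + \beta(c_2 - 1)\right] \\
& + k2^{\alpha}\left[\frac{\alpha(\alpha - 1)}{8}(c_1 - 2)^2 + \frac{\alpha\beta}{2}(c_1 - 2)(c_2 - 1) + \frac{\beta(\beta - 1)}{2}(c_2 - 1)^2\right]
\end{aligned}$$
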


The ideas concerning points of relative maximum and minimum of a
function of several variables are much the same as for functions of a single
variable (Section 12.8). Thus if $f$ is a function of $n$ variables, the point
$a = (a_1, a_2, \ldots, a_n)$ is a point of relative maximum for $f$ if there is a positive
number $\delta$ such that $f(x) \le f(a)$ for all points $x$ which lie within $\delta$ distance
of $a$. Similarly, if the inequality $f(x) \ge f(a)$ holds for all $x$ within $\delta$ distance
of $a$, then $a$ is a point of relative minimum for $f$.

Example 3 The point $(x, y, f(x, y)) = (1, 2, 4)$ is a relative maximum
point for the function $f(x, y) = 2x + 4y - x^2 - y^2 - 1$, as may be seen
by writing $f$ in the equivalent form $f(x, y) = 4 - (x - 1)^2 - (y - 2)^2$.
Consider the function

$$g(x) = f(x, 2) = 2x + 8 - x^2 - 4 - 1 = 2x - x^2 + 3$$

obtained by fixing $y = 2$. Geometrically, the graph of $g$ is the intersection
of the surface representing $f$ and the plane whose equation is $y = 2$. (See





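for one variable (here $c$ is between 0 and $t$). In terms of $H$, this gives

$$\begin{aligned}
H(b_1, b_2) = H(0) &+ \left[b_1\frac{\partial H(0)}{\partial x} + b_2\frac{\partial H(0)}{\partial y}\right] + \cdots \\
&+ \frac{1}{(n-1)!}\sum_{i=0}^{n-1}\binom{n-1}{i}\frac{\partial^{n-1}H(0)}{\partial x^i\,\partial y^{n-1-i}}b_1^ib_2^{n-1-i} \\
&+ \frac{1}{n!}\sum_{i=0}^{n}\binom{n}{i}\frac{\partial^nH(cb_1, cb_2)}{\partial x^i\,\partial y^{n-i}}b_1^ib_2^{n-i}
\end{aligned} \qquad (10)$$

where $c$ is between 0 and 1, as the Taylor Formula for two variables.
Obviously, we may go through the same derivation with an arbitrary point
$a = (a_1, a_2)$ in place of the point $(0, 0)$. The quantities $b_1$ and $b_2$ are then
replaced by $b_1 - a_1$ and $b_2 - a_2$, respectively.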


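Example 1 For the function $f(x, y) = e^x\sin y$, the first, second, and third
partial derivatives are

$$\frac{\partial f}{\partial x} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^3 f}{\partial x^3} = e^x\sin y$$

$$\frac{\partial f}{\partial y} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^3 f}{\partial x^2\,\partial y} = e^x\cos y$$

$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial^3 f}{\partial x\,\partial y^2} = -e^x\sin y$$

$$\frac{\partial^3 f}{\partial y^3} = -e^x\cos y$$

At $(x, y) = (0, 0)$, the respective values are 0, 1, 0, and $-1$. Together with
$f(0, 0) = 0$, this gives, for the Taylor polynomial of third degree,

$$\begin{aligned}
p(x, y) &= 0 + 0\cdot x + 1\cdot y + \frac{1}{2!}\cdot 0\cdot x^2 + \frac{2}{2!}\cdot 1\cdot xy + \frac{1}{2!}\cdot 0\cdot y^2 \\
&\quad + \frac{1}{3!}\cdot 0\cdot x^3 + \frac{3}{3!}\cdot 1\cdot x^2y + \frac{3}{3!}\cdot 0\cdot xy^2 + \frac{1}{3!}(-1)y^3 \\
&= y + xy + \tfrac{1}{2}x^2y - \tfrac{1}{6}y^3
\end{aligned}$$

The error term involves all fourth partials. ►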
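The quality of the third-degree polynomial $p(x, y) = y + xy + \tfrac12x^2y - \tfrac16y^3$ as an approximation to $e^x\sin y$ can be inspected numerically. The two evaluation points below are arbitrary assumptions, chosen only to contrast a point near the origin with one farther away.

```python
import math


def f(x, y):
    """The function of Example 1: e^x sin y."""
    return math.exp(x) * math.sin(y)


def p(x, y):
    """Third-degree Taylor polynomial of f around (0, 0)."""
    return y + x * y + 0.5 * x * x * y - y ** 3 / 6.0


error_near = abs(f(0.1, 0.2) - p(0.1, 0.2))  # close to the expansion point
error_far = abs(f(0.5, 1.0) - p(0.5, 1.0))   # farther from the expansion point
print(error_near, error_far)
```

Because the error term involves fourth partials, the discrepancy should shrink rapidly as the point approaches the origin, which is what the two printed errors illustrate.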





be a function of $n$ variables and let $a$ be a point of relative maximum or
relative minimum which is interior to the domain of $f$. Then all the first
partial derivatives of $f$, evaluated at $x = a$, are zero. That is, if the interior
point $a$ is a relative maximum or minimum, then

$$\frac{\partial f(a)}{\partial x_1} = \frac{\partial f(a)}{\partial x_2} = \cdots = \frac{\partial f(a)}{\partial x_n} = 0$$



Example 4 The function $f(x, y) = x^2 - y^2$, shown in Figure 6, has
neither a maximum nor a minimum at $(x, y) = (0, 0)$. However, both
partial derivatives $\partial f/\partial x = 2x$ and $\partial f/\partial y = -2y$ are zero at $(0, 0)$. To a
person walking along the surface in the plane $y = 0$, the origin looks like
a minimum. To a person walking in the plane $x = 0$, the origin looks like a
maximum. Such a point is called a minimax, or a saddle point, of the function.





Example 4 shows that all partial derivatives being zero at a certain point
$a$ is not sufficient to guarantee that a function has a relative maximum or
minimum at $a$. The same situation arose with functions of one variable.
In that case we resorted either to examining the first derivative near $a$ or to
a test involving the second derivative. For functions of several variables the
latter alternative is best. We use Taylor's Formula to develop such a test,
again developing the theory only for a function of two variables.
Suppose, then, that for the function $f(x, y)$ both partial derivatives
$\partial f/\partial x$ and $\partial f/\partial y$ are zero at the point $a$. Taking $n = 2$ and choosing a
point $b = (b_1, b_2)$ near $a = (a_1, a_2)$, the Taylor Formula (10) becomes

$$f(b) = f(a) + \frac{1}{2}\left[\frac{\partial^2 f(c)}{\partial x^2}(b_1 - a_1)^2 + 2\frac{\partial^2 f(c)}{\partial x\,\partial y}(b_1 - a_1)(b_2 - a_2) + \frac{\partial^2 f(c)}{\partial y^2}(b_2 - a_2)^2\right]$$

where $c = (c_1, c_2)$ is a point between $a$ and $b$.





FIGURE 7 


P 



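to pass through all the points. In general, this will not happen, so he looks
for the line, called the least-squares line, which minimizes the sum of squares
of the vertical distances between the line and the plotted points. That is, for
given points $(A_1, P_1), (A_2, P_2), \ldots, (A_n, P_n)$, he seeks to minimize

$$g(\alpha, \beta) = (P_1 - \alpha A_1 - \beta)^2 + (P_2 - \alpha A_2 - \beta)^2 + \cdots + (P_n - \alpha A_n - \beta)^2 = \sum_{i=1}^{n}(P_i - \alpha A_i - \beta)^2$$

The distances are squared in order to avoid the situation where negative and
positive deviations cancel, leaving a zero sum even though the line does not
pass through all the points.

The first partial derivatives of $g$ are

$$\frac{\partial g}{\partial\alpha} = 2\sum_{i=1}^{n}(P_i - \alpha A_i - \beta)(-A_i)$$

$$\frac{\partial g}{\partial\beta} = 2\sum_{i=1}^{n}(P_i - \alpha A_i - \beta)(-1)$$

Setting these equal to zero leads to the equations

$$\begin{aligned}
\alpha\sum_{i=1}^{n}A_i^2 + \beta\sum_{i=1}^{n}A_i &= \sum_{i=1}^{n}A_iP_i \\
\alpha\sum_{i=1}^{n}A_i + n\beta &= \sum_{i=1}^{n}P_i
\end{aligned} \qquad (11)$$

which must be solved for $\alpha$ and $\beta$.

The second partial derivatives of $g$ are

$$\frac{\partial^2 g}{\partial\alpha^2} = 2\sum_{i=1}^{n}A_i^2, \qquad \frac{\partial^2 g}{\partial\alpha\,\partial\beta} = 2\sum_{i=1}^{n}A_i, \qquad\text{and}\qquad \frac{\partial^2 g}{\partial\beta^2} = 2n$$
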




The difference $f(b) - f(a)$ is thus represented as a quadratic form (see
Section 10.3) $\tfrac{1}{2}X'AX$, where $X'$ is the row vector $X' = (b_1 - a_1, b_2 - a_2)$
and $A$ is the symmetric matrix

$$A = \begin{pmatrix} \dfrac{\partial^2 f(c)}{\partial x^2} & \dfrac{\partial^2 f(c)}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f(c)}{\partial x\,\partial y} & \dfrac{\partial^2 f(c)}{\partial y^2} \end{pmatrix}$$

of second partials of $f$ evaluated at $c$. In this form it is obvious (assuming that
the second partials are continuous) that $a$ is a point of relative minimum for
$f$ if and only if this quadratic form is positive for all points $b$ near $a$. This is
the case if and only if the matrix $A$ is positive definite (see Section 10.3).
Similarly, $a$ is a point of relative maximum for $f$ if and only if $A$ is a
negative definite matrix. If $A$ is an indefinite form, $a$ is neither a relative
maximum nor a relative minimum, while if $A$ is semidefinite, the test fails
and no specific result may be claimed.


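Example 5 (a) For the function $f(x, y) = 2x + 4y - x^2 - y^2 - 1$
considered in Example 3, we have

$$\frac{\partial f}{\partial x} = 2 - 2x, \quad \frac{\partial f}{\partial y} = 4 - 2y, \quad \frac{\partial^2 f}{\partial x^2} = -2, \quad \frac{\partial^2 f}{\partial y^2} = -2, \quad\text{and}\quad \frac{\partial^2 f}{\partial x\,\partial y} = 0$$

The first partial derivatives are both zero at the point $(1, 2)$. The matrix

$$A = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$$

of second partials is negative definite everywhere. Hence $f$ has a relative
maximum at $(1, 2)$.

(b) For the function $f(x, y) = x^2 - y^2$ in Example 4, $\partial f/\partial x = 2x$,
$\partial f/\partial y = -2y$, $\partial^2 f/\partial x^2 = 2$, $\partial^2 f/\partial y^2 = -2$, and $\partial^2 f/\partial x\,\partial y = 0$. Both first
partials are zero at the origin, but the matrix

$$A = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$$

of second partials is indefinite. Hence the point $(0, 0)$ is neither a maximum
nor a minimum. ►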
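For a symmetric $2 \times 2$ matrix of second partials, definiteness can be decided from the sign of the determinant together with the sign of the upper-left entry. This determinant criterion is a standard fact substituted here for the general discussion of Section 10.3; the function name `classify` and its return strings are assumptions made for the illustration.

```python
def classify(fxx, fxy, fyy):
    """Classify a critical point from the 2x2 matrix [[fxx, fxy], [fxy, fyy]].

    positive definite  -> relative minimum
    negative definite  -> relative maximum
    indefinite         -> saddle point
    otherwise (det == 0, semidefinite) the test fails.
    """
    det = fxx * fyy - fxy ** 2
    if det > 0 and fxx > 0:
        return "relative minimum"
    if det > 0 and fxx < 0:
        return "relative maximum"
    if det < 0:
        return "saddle point"
    return "test fails"


# Second partials from Example 5(a) and 5(b).
result_a = classify(-2, 0, -2)
result_b = classify(2, 0, -2)
print(result_a, result_b)
```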


Example 6 A common statistical problem is to find the straight line which
best fits a set of data points plotted in the plane. What straight line best fits
the seven points plotted in Figure 7, which record the results of seven
experiments relating amount of advertising $A$ and total profit $P$? The statistician
answers as follows. Ideally, he would like the chosen line

$$P = f(A) = \alpha A + \beta$$



13. A firm producing two commodities has the revenue function

$$R = p_1q_1 + p_2q_2$$

where $p_i$ is the price and $q_i$ the quantity of the $i$th product ($i = 1, 2$).
The cost function is

$$C = 3q_1^2 + 2q_1q_2 + 3q_2^2$$

Assuming fixed prices $p_1$ and $p_2$, show that profit is maximized when

$$q_1 = \frac{3p_1 - p_2}{16} \quad\text{and}\quad q_2 = \frac{3p_2 - p_1}{16}$$

14. If the cost function in Problem 13 is C ~ q\ -f Sqiqz + q\, is there an 
interior maximum for the profit function 9 Why or why not 9 

15. Take a = 2 and 0 * 3 m the utility function of Example 8 of Section 
171 and Example 2 of Section 17 3 Write the third-degree Taylor 
polynomial approximation to this function 

(a) around the point (0, 0) 

(b) around the point (1, 2) 

16. Find the least-squares line of the form y — ax -f- b for the data below 
Plot the points and sketch the least-squares line 


( I 1 2 3 4 S 


X, 

>• 


0 2 16 4 

1 9 4 21 15 


17. Suppose that observed data consists of a set of it ordered 4-tuples 
(*„ z„ u,), i = 1, 2, , n 

(a) In terms of lhc quantities x„ >„ and », find equations for 0i, 
02 , and 02 to give a least-squares surface of the form 


It = 0lX + 02) + 0iZ 

(// mt Minimize 21?.! (», — 0i*, — 02), ) 

(b) Apply jour results to the data 


3. 


>i, 


1 

0 

1 

0 

3 


2 3 4 5 

1 4 2 1 

-13 10 

2 0 11 

6 5 4 3 





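It is easily verified that the matrix

$$M = 2\begin{pmatrix} \sum_{i=1}^{n}A_i^2 & \sum_{i=1}^{n}A_i \\[1ex] \sum_{i=1}^{n}A_i & n \end{pmatrix}$$

of second partials is positive definite. Hence the solutions of the Equations
(11) yield a point of relative minimum for $g$.
For the points in Figure 7, we compute $n = 7$,

$$\sum_{i=1}^{7}A_i = 30, \quad \sum_{i=1}^{7}P_i = 32, \quad \sum_{i=1}^{7}A_i^2 = 206, \quad\text{and}\quad \sum_{i=1}^{7}A_iP_i = 175$$

The equations

$$206\alpha + 30\beta = 175$$

$$30\alpha + 7\beta = 32$$

have the solution $\alpha = \tfrac{265}{542}$ and $\beta = \tfrac{671}{271}$, giving $f(A) = \tfrac{265}{542}A + \tfrac{671}{271}$
(or approximately, $f(A) = \tfrac{1}{2}A + \tfrac{5}{2}$) as the equation of the least-squares
line. ►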
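The normal equations (11) form a $2 \times 2$ linear system, so they can be solved exactly with Cramer's rule. The sketch below uses exact rational arithmetic on the sums quoted for Figure 7; the function name and argument names are assumptions introduced for illustration.

```python
from fractions import Fraction


def solve_normal_equations(n, sum_a, sum_p, sum_aa, sum_ap):
    """Solve  alpha*sum_aa + beta*sum_a = sum_ap
              alpha*sum_a  + beta*n     = sum_p
    by Cramer's rule, returning exact fractions (alpha, beta)."""
    det = sum_aa * n - sum_a * sum_a
    alpha = Fraction(sum_ap * n - sum_a * sum_p, det)
    beta = Fraction(sum_aa * sum_p - sum_a * sum_ap, det)
    return alpha, beta


# Sums quoted in the text for the seven points of Figure 7.
alpha, beta = solve_normal_equations(n=7, sum_a=30, sum_p=32, sum_aa=206, sum_ap=175)
print(alpha, beta)                 # 265/542 671/271
print(float(alpha), float(beta))   # slope and intercept as decimals
```

The determinant here is $206 \cdot 7 - 30^2 = 542$, which is positive and is consistent with the positive definiteness of $M$.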


PROBLEMS 

In Problems 1-10, find maximum and minimum points of the indicated 
functions Show all tests 

1 fU, y) - * 2 + 3 y 2 - 2 a : 

2 gC*. 0 = j 3 - 3/ 2 + st 2 

3 *(*,>) = 2xy 

4 i/(r, 0) _ r 2 - 2fl 2 

5 /(u, t) = l/u + uo - 2/o 

6 u(x,y) = jr>(6 - x - y) 

7 AC*i.*a)- (*i + * 2 ) 2 -x?-* 2 

8 fix.y,:) = x 2 + >2+ , 2 

9 fiUi, ATo, Xj) - (ti + x 2 + x 3 )° 

10 Kxu x 2 , X 3 ) = (XI + AT 2 + X2? - x\~ xl- xl 

11 bind the largest product p\p-,p 3 which can be formed subject to the 
restriction pi + p* + p 3 = i 

12. A monopolist products commodities A and D having demand functions 
x(a) =» 18 — 2a and y (6) =10 — 6, when, a and b art the respective 
unit prices If the joint cost function is C{a,b) = [x(a)l 3 + [><*)1 2 . 
find the prices and corresponding outputs which maximize profit 





Example 1 Consider the problem of maximizing the function

$$f(x, y) = x^2 - y^2 - y \qquad (15)$$



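subject to the restriction

$$h(x, y) = x^2 + y^2 - 1 = 0 \qquad (16)$$

We could solve for either $x$ or $y$ in (16) and substitute the result into
(15). Instead, we differentiate the functions as they stand.
We find

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} \qquad (17)$$

$$\frac{dh}{dx} = \frac{\partial h}{\partial x} + \frac{\partial h}{\partial y}\frac{dy}{dx} = 2x + 2y\frac{dy}{dx} \qquad (18)$$

Substituting $dy/dx = -x/y$ from (18) into (17) and equating to zero in
order to maximize gives

$$2x - (2y + 1)\left(-\frac{x}{y}\right) = 0 \quad\text{or}\quad x\left[4 + \frac{1}{y}\right] = 0$$

This is zero when $x = 0$ or when $y = -\tfrac{1}{4}$. Using (16), the critical points
are thus seen to be $(0, 1)$, $(0, -1)$, $(\sqrt{15}/4, -\tfrac{1}{4})$, and $(-\sqrt{15}/4, -\tfrac{1}{4})$.
Now $Df(x) = 2x - (2y + 1)(-x/y) = 4x + x/y$, so that

$$D^2f(x) = \frac{\partial Df(x)}{\partial x} + \frac{\partial Df(x)}{\partial y}\frac{dy}{dx} = 4 + \frac{1}{y} + \left(-\frac{x}{y^2}\right)\left(-\frac{x}{y}\right) = 4 + \frac{1}{y} + \frac{x^2}{y^3}$$

This is positive at $(0, 1)$ and $(0, -1)$ and negative at $(\sqrt{15}/4, -\tfrac{1}{4})$ and
$(-\sqrt{15}/4, -\tfrac{1}{4})$. Hence each of the latter two points is one at which $f$ has a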

relative maximum The value of/ 




The method of Lagrange multipliers* arrives at the same results by different,
but often simpler, means. To maximize $f(x, y)$ subject to $h(x, y) = 0$, we
introduce a function $F(x, y) = f(x, y) - \lambda h(x, y)$, where $\lambda$ is a constant,
called a Lagrange multiplier, to be

*After Joseph Louis Lagrange (1736-1813), a French mathematician.

174 OPTIMIZATION WITH RESTRAINTS / 4» 



17.4 OPTIMIZATION WITH RESTRAINTS— 
LAGRANGE MULTIPLIERS 


The tests for relative maxima and minima described in the preceding section 
depend on finding these points in the interior of the domain of the function 
concerned However, it is not unusual in applications to find functional 
restrictions imposed on the domain by the nature of the problems being 
analyzed Geometrically, such restrictions confine the domain of interest to 
some curve or surface in n dimensional space 
To take an already familiar example, we have seen that maximum and 
minimum solutions of linear programming problems always occur on the 
boundary of the set of feasible solutions (see Section 9 5) Thus, solutions 
would be unchanged if the boundary of the feasible region were retained 
and the interior region deleted In this case, the domain of the linear function 
to be optimized becomes a collection of lines or planes, the points of which 
lie on the boundary, rather than in the interior of the feasible region 
Since we already know how to solve linear programming problems, we 
shall restrict our attention here to problems involving either nonlinear 
functions or domains which are not simply lines or planes 
For functions of two variables, the general problem takes the following
form: Maximize or minimize the function $f(x, y)$ subject to the restriction

    $h(x, y) = 0$   (12)

The relation (12) may be thought of as determining a function $g$ such that

    $h(x, g(x)) = 0$

for all $x$. If it is inconvenient or impossible to solve for $g(x)$, we may set
$t = f(t) = x$ in Equation (3) and apply the Chain Rule (5) to obtain

    $\frac{\partial h}{\partial x} + \frac{\partial h}{\partial y}\, g'(x) = 0$

so that

    $g'(x) = -\frac{\partial h/\partial x}{\partial h/\partial y}$   (13)

To maximize $f$ we define a function $k$ by $k(x) = f(x, g(x))$, again use the
Chain Rule to differentiate $k$, and equate the result to zero. This gives

    $\frac{dk}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\, g'(x) = 0$

or, using (13),

    $\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y}\left(\frac{\partial h/\partial x}{\partial h/\partial y}\right) = 0$   (14)

Solving this equation simultaneously with Equation (12) provides the
critical points of the function.








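Since $h(x, y) = 0$, $F$ should be maximized at the same points as $f$. We use
the Chain Rules (6) and (7) to find partial derivatives of $F$. The two equations
obtained by equating these partial derivatives to zero, together with the
restriction $h(x, y) = 0$, may then be solved for $x$, $y$, and $\lambda$.
To look at details, we compute

    $\frac{\partial F}{\partial x} = \frac{\partial f}{\partial x} - \lambda\frac{\partial h}{\partial x}$ and $\frac{\partial F}{\partial y} = \frac{\partial f}{\partial y} - \lambda\frac{\partial h}{\partial y}$

Equating these to zero gives

    $\frac{\partial f}{\partial x} - \lambda\frac{\partial h}{\partial x} = 0$   (19)

    $\frac{\partial f}{\partial y} - \lambda\frac{\partial h}{\partial y} = 0$   (20)

Solving for $\lambda$ in (20) and substituting into (19) gives

    $\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y}\left(\frac{\partial h/\partial x}{\partial h/\partial y}\right) = 0$

as in Equation (14).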


Example 2 We use the Lagrange multiplier method to reconsider Example
1. To maximize $f(x, y) = x^2 - y^2 - y$ subject to the restriction

    $h(x, y) = x^2 + y^2 - 1 = 0$

we define

    $F(x, y) = f(x, y) - \lambda h(x, y)$
            $= x^2 - y^2 - y - \lambda(x^2 + y^2 - 1)$

Then

    $\frac{\partial F}{\partial x} = 2x - 2x\lambda = 2x(1 - \lambda)$

    $\frac{\partial F}{\partial y} = -2y - 1 - 2y\lambda = -2y(\lambda + 1) - 1$

Solving the equations $\partial F/\partial x = 0$ and $\partial F/\partial y = 0$ gives $x = 0$ or $\lambda = 1$,
and $2y(\lambda + 1) + 1 = 0$. For $x = 0$, $h(x, y) = 0$ yields $y = \pm 1$. If
$\lambda = 1$, then $2y(\lambda + 1) + 1 = 0$ requires $y = -\frac{1}{4}$, which in turn gives
$x = \pm\sqrt{15}/4$. The critical points are $(\sqrt{15}/4, -\frac{1}{4})$, $(-\sqrt{15}/4, -\frac{1}{4})$, $(0, 1)$,
and $(0, -1)$ as before. Notice how much more easily the results are obtained
here than in Example 1.
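The critical points of Example 2 can be checked numerically. The following is only a sketch: the helper names (`f`, `h`, `grad_F`) and the brute-force scan of the circle are illustrative choices, not part of the text. The multiplier values for the points with $x = 0$ are obtained from $2y(\lambda + 1) + 1 = 0$.

```python
import math

def f(x, y):
    # Function to be maximized in Example 2.
    return x**2 - y**2 - y

def h(x, y):
    # Restriction h(x, y) = 0 (the unit circle).
    return x**2 + y**2 - 1

def grad_F(x, y, lam):
    # Partial derivatives of F = f - lam*h with respect to x and y.
    return (2*x - 2*lam*x, -2*y - 1 - 2*lam*y)

root15_over_4 = math.sqrt(15) / 4
# (x, y, lambda) triples; lambda for x = 0 solves 2y(lambda + 1) + 1 = 0.
critical_points = [
    (root15_over_4, -0.25, 1.0),
    (-root15_over_4, -0.25, 1.0),
    (0.0, 1.0, -1.5),
    (0.0, -1.0, -0.5),
]

for x, y, lam in critical_points:
    dFdx, dFdy = grad_F(x, y, lam)
    print(f"({x:+.4f}, {y:+.4f}): dF/dx={dFdx:.1e}, dF/dy={dFdy:.1e}, "
          f"h={h(x, y):.1e}, f={f(x, y):.4f}")

# Independent check: scan f over the circle x = cos t, y = sin t.
steps = 100_000
circle_max = max(
    f(math.cos(2 * math.pi * k / steps), math.sin(2 * math.pi * k / steps))
    for k in range(steps)
)
print("largest value of f found on the circle:", round(circle_max, 6))
```

The scan agrees with the value of $f$ at the two points with $y = -\frac{1}{4}$, which is where the maximum on the circle occurs.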


The method of Lagrange multipliers applies equally well when there is
more than one restriction, provided only that the number of variables is
greater than the number of restrictions.




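Example 3 An Allocation Problem. The quantities $Q_1$ and $Q_2$ of output
of a two-product firm depend on labor $L$ and capital $K$ according to the
production functions

    $Q_1 = a_1 L_1^{\alpha_1} K_1^{\beta_1}$ and $Q_2 = a_2 L_2^{\alpha_2} K_2^{\beta_2}$

Here $a_1$ and $a_2$ are positive constants, while $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ are constants
between 0 and 1. There is a fixed total quantity $S$ of labor and a fixed total
quantity $T$ of capital available to the firm. If we assume that the respective
prices $p_1$ and $p_2$ are fixed, the problem becomes that of allocating labor and
capital to the two commodities in order to maximize revenue. Symbolically,
we seek to maximize

    $R = p_1 Q_1 + p_2 Q_2 = p_1 a_1 L_1^{\alpha_1} K_1^{\beta_1} + p_2 a_2 L_2^{\alpha_2} K_2^{\beta_2}$

subject to the restrictions

    $L_1 + L_2 - S = 0$ and $K_1 + K_2 - T = 0$   (21)

To do this we introduce a function

    $F(L_1, L_2, K_1, K_2) = p_1 a_1 L_1^{\alpha_1} K_1^{\beta_1} + p_2 a_2 L_2^{\alpha_2} K_2^{\beta_2}$
                          $- \lambda_1(L_1 + L_2 - S) - \lambda_2(K_1 + K_2 - T)$

involving two Lagrange multipliers $\lambda_1$ and $\lambda_2$. The partial derivatives are

    $\frac{\partial F}{\partial L_1} = p_1 a_1 \alpha_1 L_1^{\alpha_1 - 1} K_1^{\beta_1} - \lambda_1$

    $\frac{\partial F}{\partial L_2} = p_2 a_2 \alpha_2 L_2^{\alpha_2 - 1} K_2^{\beta_2} - \lambda_1$

    $\frac{\partial F}{\partial K_1} = p_1 a_1 \beta_1 L_1^{\alpha_1} K_1^{\beta_1 - 1} - \lambda_2$

    $\frac{\partial F}{\partial K_2} = p_2 a_2 \beta_2 L_2^{\alpha_2} K_2^{\beta_2 - 1} - \lambda_2$

Equating these to zero and solving these equations in conjunction with
Equations (21) give

    $p_1 a_1 \alpha_1 L_1^{\alpha_1 - 1} K_1^{\beta_1} - p_2 a_2 \alpha_2 (S - L_1)^{\alpha_2 - 1} (T - K_1)^{\beta_2} = 0$   (22)

    $p_1 a_1 \beta_1 L_1^{\alpha_1} K_1^{\beta_1 - 1} - p_2 a_2 \beta_2 (S - L_1)^{\alpha_2} (T - K_1)^{\beta_2 - 1} = 0$   (23)

Multiplying Equation (22) by $\beta_2(S - L_1)$ and Equation (23) by
$\alpha_2(T - K_1)$ and subtracting the resulting equations yields

    $a_1 p_1 L_1^{\alpha_1 - 1} K_1^{\beta_1 - 1}\left[\alpha_1 \beta_2 K_1 (S - L_1) - \alpha_2 \beta_1 L_1 (T - K_1)\right] = 0$

Since $a_1$, $p_1$, $L_1$, and $K_1$ are all positive, this simplifies to

    $\alpha_1 \beta_2 K_1 (S - L_1) = \alpha_2 \beta_1 L_1 (T - K_1)$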




or, equivalently,

    $\frac{\alpha_1/\alpha_2}{\beta_1/\beta_2} = \frac{L_1/(S - L_1)}{K_1/(T - K_1)}$

For simplicity of interpretation, suppose that $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$.
Then the revenue function $R$ has an extremum wherever

    $\frac{L_1}{S - L_1} = \frac{K_1}{T - K_1}$

that is, whenever capital is assigned to the two products in the same ratio
as labor.
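The conclusion of Example 3 can be illustrated with a crude grid search. This is a sketch under stated assumptions: all numerical values below ($S$, $T$, the exponents, and the products $c_1 = p_1 a_1$, $c_2 = p_2 a_2$) are hypothetical, chosen only so that the two products share the same exponents, and the grid search stands in for solving Equations (22) and (23).

```python
# Hypothetical data: equal exponents for both products, as in the special
# case alpha_1 = alpha_2 and beta_1 = beta_2 discussed in the text.
S, T = 10.0, 20.0          # total labor and total capital (assumed values)
ALPHA, BETA = 0.3, 0.4     # common exponents (assumed values)
C1, C2 = 2.0, 1.0          # c_i = p_i * a_i (assumed values)

def revenue(L1, K1):
    # Revenue when product 1 receives (L1, K1) and product 2 the remainder.
    L2, K2 = S - L1, T - K1
    return C1 * L1**ALPHA * K1**BETA + C2 * L2**ALPHA * K2**BETA

N = 200  # grid resolution: fractions i/N of labor and j/N of capital
best_revenue, labor_share, capital_share = max(
    (revenue(S * i / N, T * j / N), i / N, j / N)
    for i in range(1, N)
    for j in range(1, N)
)

print("share of labor assigned to product 1:  ", labor_share)
print("share of capital assigned to product 1:", capital_share)
```

On this grid the best allocation gives product 1 the same fraction of the labor as of the capital, which is equivalent to $L_1/(S - L_1) = K_1/(T - K_1)$.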


Example 4 Symmetric matrices of the form

    $A = \begin{pmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{12} & 1 & r_{23} & \cdots & r_{2n} \\ r_{13} & r_{23} & 1 & \cdots & r_{3n} \\ \vdots & & & & \vdots \\ r_{1n} & r_{2n} & r_{3n} & \cdots & 1 \end{pmatrix}$ and $B = \begin{pmatrix} r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{12} & r_{22} & r_{23} & \cdots & r_{2n} \\ r_{13} & r_{23} & r_{33} & \cdots & r_{3n} \\ \vdots & & & & \vdots \\ r_{1n} & r_{2n} & r_{3n} & \cdots & r_{nn} \end{pmatrix}$

arise in the theory of mental tests. A common problem is to find a vector
of weights

    $W = (w_1, w_2, \ldots, w_n)$

which will maximize the reliability function

    $R(W) = \frac{WBW'}{WAW'}$

subject to the restriction $WAW' = 1$. Note that $R(W)$ is the ratio of two
quadratic forms with matrices $B$ and $A$, respectively.
Following Gulliksen,* we introduce a function

    $L(W) = WBW' - \lambda(WAW' - 1)$

involving a Lagrange multiplier $\lambda$. The partial derivatives of $L(W)$ with
respect to $w_1, w_2, \ldots, w_n$ may be written in matrix form, apart from a
common factor of 2, as $WB - \lambda WA$. Equating these partial derivatives to
zero gives

    $WB - \lambda WA = O$ or $W(B - \lambda A) = O$   (24)

as the matrix equation which the maximizing $W$ vector must satisfy.
We recall (Section 10.1) that Equations (24) have a nonzero solution if and
only if

    $\det(B - \lambda A) = 0$   (25)

However, Equation (25) may have more than one solution and we must
decide which to choose. We argue as follows. If we multiply the first of
Equations (24) by $W'$, we obtain $WBW' - \lambda WAW' = 0$ as an equation
which $\lambda$ must satisfy. Solving for $\lambda$ gives

    $\lambda = \frac{WBW'}{WAW'}$

a value of the reliability function $R$. Since we wish to maximize $R$, it follows
that we should first choose $\lambda$ as the largest solution of (25), then substitute
into (24) and solve for the corresponding $W$.

*Gulliksen, H., Theory of Mental Tests (John Wiley & Sons, Inc., New York, 1950), pp. 346-348.
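The rule "choose $\lambda$ as the largest solution of (25)" can be tried out on a two-test case. The sketch below is an illustration only: the entries of $A$ and $B$ are hypothetical numbers, and for $n = 2$ Equation (25) reduces to the quadratic $(1 - r^2)\lambda^2 - (b_{11} + b_{22} - 2 r b_{12})\lambda + (b_{11} b_{22} - b_{12}^2) = 0$, which is solved directly. Because $R(W)$ is unchanged when $W$ is rescaled, scanning unit vectors is enough to compare against.

```python
import math

# Hypothetical 2 x 2 matrices: A = [[1, r], [r, 1]], B = [[b11, b12], [b12, b22]].
r = 0.5
b11, b12, b22 = 0.8, 0.4, 0.6

# det(B - lam*A) = 0 written as qa*lam^2 + qb*lam + qc = 0.
qa = 1 - r**2
qb = -(b11 + b22 - 2 * r * b12)
qc = b11 * b22 - b12**2
root_term = math.sqrt(qb**2 - 4 * qa * qc)
lam_small = (-qb - root_term) / (2 * qa)
lam_large = (-qb + root_term) / (2 * qa)

def reliability(w1, w2):
    # R(W) = WBW' / WAW' for the weight vector W = (w1, w2).
    numerator = b11 * w1**2 + 2 * b12 * w1 * w2 + b22 * w2**2
    denominator = w1**2 + 2 * r * w1 * w2 + w2**2
    return numerator / denominator

# Scan directions W = (cos t, sin t) for 0 <= t < pi.
steps = 20_000
scan_max = max(
    reliability(math.cos(math.pi * k / steps), math.sin(math.pi * k / steps))
    for k in range(steps)
)

print("solutions of (25):", lam_small, lam_large)
print("largest R(W) found by scanning:", scan_max)
```

For these numbers the larger root of (25) coincides with the largest reliability found by the scan, attained at $W = (1, 0)$.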

PROBLEMS 

1. Maximize $f(x, y) = x^2 - 2y^2 + 2xy + 4x$ subject to the restriction
$2x - y = 0$.

2. Maximize $x(y + 6)$ subject to $x + y = 2$.

3. Maximize $g(x, y, w) = 2x + y + 2w + x^2 - 3w^2$ subject to
$x + y + w - 1 = 0$ and $2x - y + w = 2$.

4. Minimize $H(x, y) = x^2 + y^2$ subject to $x^2 + y^2 - 2x - 4y + 2 = 0$.

5. Maximize $R(u, v, w) = 10uvw$ subject to $u^2 + v^2 + w^2 = 4$.

6. Maximize $f(x_1, x_2, \ldots, x_n) = (x_1 x_2 \cdots x_n)^2$ subject to the restriction

    $x_1^2 + x_2^2 + \cdots + x_n^2 = 1$

7. Sales $S$ of a firm depend on material cost $C$ and labor cost $L$ according
to

    $S(C, L) = 5CL - 3L^2$

If the budget is such that

    $C + L = 15$

find the maximum sales attainable.

8. Assume that the utility of quantities $x$ and $y$ of two commodities is given
by a function $U = F(x, y)$ where $\partial F/\partial x$ and $\partial F/\partial y$ are positive. Given a
fixed budget total $B$ and fixed prices $p$ and $q$ for the two commodities,
prove that the consumer, in order to maximize utility, must allocate his
budget so as to equalize the ratio of marginal utility to price for each
commodity.

9. An indifference curve is defined as the locus of points $(x, y)$ such that

    $F(x, y) = \text{constant}$

Show that in order to maximize utility in Problem 8, the consumer must
allocate his budget so as to equate the slope of his budget line with the
slope of some indifference curve.

10. Find the utility-maximizing demand for each of two goods if the utility
function is

    $U(x, y) = 10x + 30y - x^2 - 4y^2$



these volumes gives

    $\sum_{k=1}^{n} A\!\left(\frac{4k}{n}\right)\frac{4}{n}$   (26)

as an approximation to the actual volume.

As the number $n$ of intervals is increased, it is reasonable to expect the
sum (26) to converge to the actual volume. On the other hand, the limit

    $\lim_{n \to \infty} \sum_{k=1}^{n} A\!\left(\frac{4k}{n}\right)\frac{4}{n}$

is, by definition, the integral

    $\int_0^4 A(x)\,dx$   (27)

of the function $A$ over the interval from $x = 0$ to $x = 4$. We are thus led to
define the volume in question by the integral (27).

The area $A(x)$ may itself be computed as an integral. For instance, when
$x = 1$, $f(x, t) = f(1, t) = 4 - 1 - t^2 = 3 - t^2$ and $A(1)$ is the area

    $A(1) = \int_0^{\sqrt{3}} (3 - t^2)\,dt$

shaded in Figure 9. (This is the same as the area shaded in Figure 8 if
$4k/n = 1$.)

FIGURE 9






and the budgetary restriction is

    $B = 20 = px + qy = 3x + 4y$

11. Let $B$ be a $p$-component column vector and $\Sigma$ a $p \times p$ symmetric matrix.
Given the problem of maximizing $B'\Sigma B$ subject to the constraint that
$B'B = 1$, show that

    $B'\Sigma B = \lambda B'B = \lambda$

where $\lambda$ is a Lagrange multiplier. (Hint: See Example 4 of this section.)


17.5 ITERATED INTEGRALS

In Chapter 14, we used the concept of area to motivate the definition of the
integral of a function of a single variable. In a similar way, the problem of
computing the volume of a three-dimensional solid leads to the definition
of an iterated integral of a function of two variables.


Example 1 To find the volume of the solid lying in the first octant and
bounded above by the graph of $f(x, t) = 4 - x - t^2$ (see Figure 8), we
begin by partitioning the interval from $x = 0$ to $x = 4$ into $n$ smaller intervals
determined by the points $0, 4/n, 8/n, \ldots, 4$. We then approximate that part
of the total volume which lies in the $k$th interval $[4(k - 1)/n, 4k/n]$ by
multiplying the width $4/n$ of the interval by the area $A(4k/n)$ of the right-hand
face of the section cut by the $k$th interval (shaded in Figure 8). Adding

FIGURE 8




Example 3 The iterated integral of the constant function $f(x, t) = 1$
over a plane region $B$ may be interpreted as the volume of a solid of unit
thickness having $B$ as its base. Alternatively, this is simply the area of $B$.
For instance, the area of the region $B$ shaded in Figure 10 is given either by

    $\int_0^2 \left[\int_0^{4 - x^2} dt\right] dx = \int_0^2 (4 - x^2)\,dx = \frac{16}{3}$

or by

    $\int_0^4 \left[\int_0^{(4 - t)^{1/2}} dx\right] dt = \int_0^4 (4 - t)^{1/2}\,dt$

    $= -\frac{2}{3}(4 - t)^{3/2}\Big|_0^4 = \frac{2}{3}(4)^{3/2} = \frac{16}{3}$

FIGURE 10





In general, for each fixed value of $x$ between 0 and 4,

    $A(x) = \int_0^{(4 - x)^{1/2}} (4 - x - t^2)\,dt = \left[(4 - x)t - \frac{t^3}{3}\right]_0^{(4 - x)^{1/2}} = \frac{2}{3}(4 - x)^{3/2}$

The required volume is

    $V = \int_0^4 A(x)\,dx = \int_0^4 \left[\int_0^{(4 - x)^{1/2}} (4 - x - t^2)\,dt\right] dx$

    $= \int_0^4 \frac{2}{3}(4 - x)^{3/2}\,dx$

    $= -\frac{2}{3}\cdot\frac{2}{5}(4 - x)^{5/2}\Big|_0^4 = \frac{2}{3}\cdot\frac{2}{5}(4)^{5/2} = \frac{128}{15}$

The integral

    $\int_0^4 \left[\int_0^{(4 - x)^{1/2}} (4 - x - t^2)\,dt\right] dx$

appearing in Example 1 is called an iterated integral, or double integral.
Note that in computing this integral we first hold $x$ fixed and integrate the
function $f(x, t) = 4 - x - t^2$ as though it were a function of $t$ alone.
(Compare with our procedure for partial derivatives.) The result is a function
$A(x) = \frac{2}{3}(4 - x)^{3/2}$ of $x$ alone. Integrating this function then gives the
desired result.


Example 2 The volume $V$ in Example 1 may also be computed by first
holding $t$ fixed. For each value of $t$ between 0 and 2 we obtain the area

    $B(t) = \int_0^{4 - t^2} (4 - x - t^2)\,dx = \left[(4 - t^2)x - \frac{x^2}{2}\right]_0^{4 - t^2} = \frac{1}{2}(4 - t^2)^2$

generated by cutting the volume by a plane perpendicular to the $t$ axis.
The integral of $B$ then gives the required volume

    $V = \int_0^2 B(t)\,dt = \int_0^2 \frac{1}{2}(4 - t^2)^2\,dt$

    $= \frac{1}{2}\int_0^2 (16 - 8t^2 + t^4)\,dt = \frac{1}{2}\left[16t - \frac{8t^3}{3} + \frac{t^5}{5}\right]_0^2$

    $= \frac{1}{2}\left(32 - \frac{64}{3} + \frac{32}{5}\right) = \frac{128}{15}$

as before.
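The two orders of integration in Examples 1 and 2 can also be compared numerically. The sketch below uses a simple midpoint rule; the function names and the number of subintervals are illustrative assumptions, and the results are approximations to $128/15$, not exact values.

```python
import math

def f(x, t):
    # Height of the solid above the point (x, t).
    return 4 - x - t**2

def midpoint_integral(func, lower, upper, pieces=400):
    # Midpoint-rule approximation of the integral of func over [lower, upper].
    width = (upper - lower) / pieces
    return width * sum(func(lower + (k + 0.5) * width) for k in range(pieces))

def area_A(x):
    # Cross-sectional area for fixed x: integrate over t from 0 to (4 - x)^(1/2).
    return midpoint_integral(lambda t: f(x, t), 0.0, math.sqrt(4 - x))

def area_B(t):
    # Cross-sectional area for fixed t: integrate over x from 0 to 4 - t^2.
    return midpoint_integral(lambda x: f(x, t), 0.0, 4 - t**2)

volume_x_outer = midpoint_integral(area_A, 0.0, 4.0)  # order used in Example 1
volume_t_outer = midpoint_integral(area_B, 0.0, 2.0)  # order used in Example 2

print("integrating over t first:", volume_x_outer)
print("integrating over x first:", volume_t_outer)
print("128/15 =", 128 / 15)
```

Both approximations settle near $128/15 \approx 8.533$, in line with the two hand computations.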





The change of variable formula for one-dimensional integrals is

    $\int_{g(a)}^{g(b)} h(u)\,du = \int_a^b h(g(x))\,g'(x)\,dx$   (29)

(See Formula (24) and the accompanying discussion in Section 14.8.) For
computational purposes we think of making the substitution $u = g(x)$ so
that $du = g'(x)\,dx$, and the rest is just a matter of determining the correct
limits for the new integral.

A careful reading of Section 14.8 will show that the computational shortcut
of writing $du = g'(x)\,dx$ is a device that happens to work for single-variable
cases. However, in cases involving more than one variable, this shortcut
does not work and, if used, will give entirely erroneous results. What must be
done is to use in place of the function $g'(x)$ appearing in (29) a quantity
which generalizes the idea of derivative to functions which map $n$ variables
into $n$ variables. Let us look first at a particular case.

A point $P$ in the plane may be located by specifying the horizontal distance
$x$ and the vertical distance $y$ of $P$ from the origin $(0, 0)$. The point $P$ may
also be located by stating the angle $\theta$ formed between the horizontal axis
and a line from the origin to $P$, and the distance $r$ moved along this line to
reach $P$ (see Figure 11). In terms of $r$ and $\theta$, we have

    $x = r\cos\theta$ and $y = r\sin\theta$   (30)

The Equations (30) may be considered as defining a function, or transformation,
$F$, which maps pairs $(r, \theta)$ of numbers into new pairs $(x, y)$.
The derivative, or Jacobian matrix, of this vector function

    $F(r, \theta) = \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}$

is defined to be the matrix

    $DF(r, \theta) = \begin{pmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$

Note the procedure for obtaining the limits of integration. If we first hold
$x$ constant and vary $t$, then for each fixed value of $x$ between 0 and 2, the
values of $t$ range between 0 and $4 - x^2$ (see Figure 10). On the other hand,
if $t$ is held fixed (between 0 and 4), then $x$ ranges from 0 to $(4 - t)^{1/2}$. In
each case the values of the nonfixed variable range along that portion of the
line, determined by the value of the fixed variable, which intersects the region
of integration.

Iterated integrals of functions of more than two variables may also be
computed by repeated one-dimensional integration.


Example 4 In the triple integral

    $\int\!\int_0^z\!\int_0^y (x^2 + y^2 + z^2)\,dx\,dy\,dz$   (28)

the order of integration and the variables to which the limits of integration
pertain are indicated by the order of writing the symbols $dx$, $dy$, and $dz$.
We first hold $y$ and $z$ fixed and integrate with $x$ ranging from zero to $y$ to
obtain

    $\int\!\int_0^z \left[\frac{x^3}{3} + y^2 x + z^2 x\right]_0^y dy\,dz = \int\!\int_0^z \left(\frac{y^3}{3} + y^3 + z^2 y\right) dy\,dz$

Next, $z$ is held fixed and $y$ varied from 0 to $z$, giving

Evaluating this integral yields 

/oV'HtI-s 

as the value of (28) 


dz 


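Improper iterated integrals are computed by utilizing the definitions
already stated for the one-variable case.


Example 5

    $\frac{1}{2\pi}\int_0^{2\pi}\!\int_0^{\infty} r e^{-(1/2)r^2}\,dr\,d\theta = \frac{1}{2\pi}\int_0^{2\pi}\left(\lim_{T \to \infty}\int_0^{T} r e^{-(1/2)r^2}\,dr\right) d\theta$

    $= \frac{1}{2\pi}\int_0^{2\pi} \lim_{T \to \infty}\left(1 - e^{-(1/2)T^2}\right) d\theta = \frac{1}{2\pi}\int_0^{2\pi} d\theta$

    $= 1$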



which was shown in Example 5 to have the value 1. The limits 0 to $\infty$ and
0 to $2\pi$ for $r$ and $\theta$ in (34) are chosen so as to cover the same region (the
entire plane) as that covered in the integral (33).


In general, if we make a substitution like (31) in an integral

    $\int\!\int \cdots \int_R g(y_1, y_2, \ldots, y_n)\,dy_1\,dy_2 \ldots dy_n$

where $R$ is the region over which the integral is to be evaluated, the new
integral is

    $\int\!\int \cdots \int_S g[f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n)]\,|J|\,dx_1 \ldots dx_n$

Here $S$ is the region in terms of the new variables $x_1, \ldots, x_n$ which matches
the original region $R$, and $J$ is the Jacobian of the transformation. The
absolute value of $J$ is used to ensure that both integrals have the same sign.

PROBLEMS 

In Problems 1-5, evaluate the indicated double integrals 


1 ,3 


xdydx 

3a 


-a 


(x -h y)dxdy 


dy d\ 


ki;j> 

n acosS 

r 


dx dy 


sin 0 dr dO 


6. Evaluate the improper double integral 

r r 

J i V- + J- 

In Problems 7-9, evaluate the iterated triple integrals 
7 

Jo Jl Jo 

8. f [ [ f -'v,Cn«rf l/llA;Al 

Jo Jo Jo 

9 . r rr \ *** 

Jo Jo Jo 

In Problems 10-11, use the transformation $F(r, \theta)$ as defined in Example 6.



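The Jacobian of the transformation $F$ is defined to be the determinant

    $J = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r$

of the derivative of $F$.

In general, the derivative of a vector function

    $F = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}$

which maps points $(x_1, x_2, \ldots, x_n)$ to points $(y_1, y_2, \ldots, y_n)$ by means of
the real-valued functions

    $y_1 = f_1(x_1, x_2, \ldots, x_n)$
    $y_2 = f_2(x_1, x_2, \ldots, x_n)$
    $\vdots$
    $y_n = f_n(x_1, x_2, \ldots, x_n)$   (31)

is defined by

    $DF(x_1, x_2, \ldots, x_n) = \begin{pmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 & \cdots & \partial f_1/\partial x_n \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 & \cdots & \partial f_2/\partial x_n \\ \vdots & & & \vdots \\ \partial f_n/\partial x_1 & \partial f_n/\partial x_2 & \cdots & \partial f_n/\partial x_n \end{pmatrix}$   (32)

The Jacobian $J$ of the function $F$ is the determinant of the matrix (32).
It is this Jacobian which takes the place of the derivative $g'(x)$ in (29).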
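The Jacobian of the polar-coordinate transformation $x = r\cos\theta$, $y = r\sin\theta$ can be approximated by finite differences as a check on the value $J = r$. The sketch below is illustrative only: the function names, the step size, and the sample point are arbitrary choices, and central differences give an approximation rather than the exact partial derivatives.

```python
import math

def F(r, theta):
    # The transformation (r, theta) -> (x, y) of Equations (30).
    return (r * math.cos(theta), r * math.sin(theta))

def numeric_jacobian(r, theta, step=1e-6):
    # Central-difference estimates of the four partial derivatives,
    # followed by the 2 x 2 determinant.
    x_r_plus, y_r_plus = F(r + step, theta)
    x_r_minus, y_r_minus = F(r - step, theta)
    x_t_plus, y_t_plus = F(r, theta + step)
    x_t_minus, y_t_minus = F(r, theta - step)

    dx_dr = (x_r_plus - x_r_minus) / (2 * step)
    dy_dr = (y_r_plus - y_r_minus) / (2 * step)
    dx_dtheta = (x_t_plus - x_t_minus) / (2 * step)
    dy_dtheta = (y_t_plus - y_t_minus) / (2 * step)

    return dx_dr * dy_dtheta - dx_dtheta * dy_dr

sample_r, sample_theta = 2.0, 0.7   # arbitrary sample point
estimate = numeric_jacobian(sample_r, sample_theta)
print("estimated Jacobian:", estimate, " r =", sample_r)
```

At any sample point the estimate should agree with $r$ to within the accuracy of the difference quotients.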


Example 6 In order to evaluate the integral

    $\frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(1/2)(x^2 + y^2)}\,dx\,dy$   (33)

we make the substitution $x = r\cos\theta$ and $y = r\sin\theta$. The integrand
$e^{-(1/2)(x^2 + y^2)}$ is replaced by

    $e^{-(1/2)(r^2\cos^2\theta + r^2\sin^2\theta)}\,r = r e^{-(1/2)r^2}$

the quantity $r$ being the Jacobian of the transformation. (Compare with
Equation (29).) The integral (33) is equal to

    $\frac{1}{2\pi}\int_0^{2\pi}\!\int_0^{\infty} r e^{-(1/2)r^2}\,dr\,d\theta$   (34)





10. Evaluate the integral 


IT 
ff 


(x + y )dydx 


11. Evaluate the integral 

r « 2 -* 2 > ,/2 

1 dy dx 

Jo Jo 

12. Find the area of the region $R$ bounded by the parabolas $f(x) = x^2$ and
$g(x) = 9 - x^2$.


13. Find the area of the region $R$ bounded by $y^2 = 2x^3$ and $y = 2x$.

14. Find the volume of the solid bounded by the surfaces $x^2 + y^2 = 9$,
$z = x + y + 4$, and $z = 0$.


15. Find the volume in the first octant bounded by the surfaces x 2 + y 2 = 4, 
s — x, and z ~ x 2 


16. Find the volume of the solid in the first octant bounded by the two
surfaces $z = 9 - x^2$ and $y = 9 - x^2$.

17. Find the volume bounded by the surfaces x 2 + y 2 = z, x 2 + y 2 * 
and z = 0 


SUPPLEMENTARY READING 

Stein, S. K., Calculus in the First Three Dimensions (McGraw-Hill Book Company,
New York, 1967), Chapters 9, 11, and 21-23.

Thomas, G. B., Calculus and Analytic Geometry, 4th ed. (Addison-Wesley Publishing
Company, Reading, Massachusetts, 1968), Chapters 15 and 16.





PART IV 


Probability 




In general, a sample space $\Omega$ for an experiment is a set having the property
that to each possible outcome of the experiment there corresponds exactly
one element of $\Omega$. Since each outcome corresponds to exactly one element of

may represent more than one possible outcome 


Example 1 Suppose an experiment consists of tossing a nickel and a dime.
There are four possible outcomes:

(a) Both coins show heads.

(b) The nickel shows heads and the dime shows tails.

(c) The nickel shows tails and the dime heads.

(d) Both coins show tails.

A possible sample space for this experiment is

    $\Omega = \{HH, HT, TH, TT\}$

where the pairs of letters correspond, respectively, to outcomes (a)-(d). An
equally valid sample space is

    $\Gamma = \{0, 1, 2\}$

in which the elements denote the number of heads showing on the two
coins. The symbol 1 in $\Gamma$ corresponds to both outcomes (b) and (c). On the
other hand, the set $V = \{\ldots\}$ is not a sample space for
this experiment since two of its elements both correspond to outcome
(b), while outcome (d) fails to be represented. In general, a sample
space represents a classification, or partition, of the possible outcomes of an
experiment and one must be careful not to mix classifications.


Example 2 An experiment may have an 

outcomes In a study of rats trame ]R a starting box For a 

Graham and Gagne* measured the P possible experimental 

particular rat, any non-negative ‘ “alld sample space 

outcome (time value) and the set of e measured only to the 

However, due to measurement hm, nations, ■ J of „,e 

nearest hundredth of a second The sample sp 

elements 0, 0 01, 0 02, 

„ Acauisition, Extinction and Spontaneous 

*Graham, C. H., and Gagné, R. M., “The Acquisition, Extinction and Spontaneous
Recovery of a Conditioned Operant Response,” Journal of Experimental Psychology 26,
251-280 (1940).


18 BASIC CONCEPTS OF PROBABILITY


18.1 SAMPLE SPACES

The theory of probability is a mathematical system that provides models
for experimental phenomena exhibiting some measure of unpredictability.
The word “experimental” is applied to any operation which generates
outcomes, whether or not an observer exercises any measure of planning or
control on the operation. Many studies in the behavioral and management
sciences are of this nature. The economist studying the national economy,
the sociologist investigating group behavior, and the psychologist conducting
a learning experiment all observe behavior whose outcome is seemingly
incapable of exact prediction. By its very nature, then, probability theory
has become one of the major mathematical tools used in the behavioral
sciences. Underlying any use of the theory of probability is a real or conceptual
experiment. We begin by considering how experiments can be described
mathematically.

Suppose we observe a rat running a T-maze. He can turn either left or
right. If we denote these outcomes by L and R, respectively, then each
possible outcome of the experiment corresponds to exactly one element of
the set $\Omega = \{L, R\}$. This set of possible outcomes is called a sample space for
the experiment. We say a rather than the sample space since there are many
ways of specifying the outcomes of this experiment. If, for example, we are
interested in the rat's behavior at the choice point as well as the choice made,
we might use

    $\Gamma = \{LH, LW, RH, RW\}$

as our sample space. Here LH indicates that the rat turned left after some
hesitation, LW denotes a left turn without hesitation, and, similarly, for
RH and RW. This example is typical of experimental situations, in that




(c) $E \cup F$ (d) $E'$

(e) $E' \cap F$ (f) $F'$

4. A firm wishes to buy three lots of material. There are four domestic and
five foreign suppliers from which to choose. Write a sample space for this
experiment. How many elements denote purchases from one domestic
and two foreign suppliers?

5. An experiment consists of observing two-person communication links
which are established among three individuals. Once established, a
communication is assumed to persist for the duration of the experiment.
Write an appropriate sample space for this experiment.

6. The sample space for Problem 5 contains eight elements. How many
elements would be required if there were

(a) 4 individuals?

(b) 5 individuals?

(c) n individuals?

7. One method of estimating the reliability of a test is to split the test into
two equal parts and obtain the correlation between the two parts.
However, as Kuder and Richardson* have argued, this method does not
give a unique estimate, due to the number of different ways a test can be
divided.

(a) Suppose that an experiment consists of dividing a 6-item test into
two 3-item tests. Denote a sample space where each element represents
a unique division.

(b) Develop a general formula for the number of distinct ways that a
test of 2n items can be split into two subtests of n items each.


18.2 EVENTS

A sample space serves as the universal set for all probability statements
related to an experiment. The subsets of this universal set serve as mathematical
representations of the various events which may occur in a performance
of the experiment. For example, suppose an economist observes the
fluctuations of the stock market on three consecutive days. Since the market
can either rise R or fail to rise F on any given day, the set

    $\Omega = \{RRR, RRF, RFR, RFF, FRR, FRF, FFR, FFF\}$

is a possible sample space for this experiment. For each outcome of the
experiment we can determine whether a given event does or does not occur.
If, for instance, we consider the event “the market rose at least twice in three
days,” we find that this event occurs if the experiment results in an outcome
corresponding to one of the elements RRR, RRF, RFR, or FRR. With any

*Kuder, G. F., and Richardson, M. W., “The Theory of the Estimation of Test Reliability,”
Psychometrika 2, 151-160 (1937).





It is often useful to be able to count the number of elements in a sample
space, even though it may be impractical to make a complete list of the
individual elements. In such cases we apply the basic principles of counting
developed in Chapter 7.

Example 3 The Rich Employment Agency specializes in placing engineering
applicants in one of 12 defined fields of occupational specialization.
From a pool of applicants, 6 engineers are chosen and their specializations
are recorded. We want to designate an appropriate sample space for this
experiment. An outcome of the experiment is specified by a listing of occupations
of the 6 engineers. Since each engineer is classified in one of the 12
specializations, the most natural sample space contains $12^6 = 2{,}985{,}984$
elements, each element corresponding to a possible listing.


PROBLEMS

1. From a group of three men and three women, three people are assigned
to a discussion group.

(a) List the 20 elements of the sample space $\Omega$ where each element
corresponds to a possible discussion group.

(b) Find the subset of $\Omega$ containing those elements which correspond to
discussion groups in which there are three women and no men.
How many elements are there in this set?

(c) Find the subset of $\Omega$ containing those elements representing discussion
groups which contain a majority of males. How many elements
are there in this set?


2. Subjects are asked to place each of three foods in one of two preference
categories. A week later, the experiment is repeated using the same three
foods. Define a suitable sample space describing the choices made by a
single subject. Write the sets of outcomes in which

(a) there are no preference reversals.

(b) there is exactly one preference reversal.

(c) there is at least one preference reversal.

3. Humphreys* has conducted a number of experiments in which a subject
is asked to predict whether a light will be on or off. Suppose that a subject
is run through four trials. Write an appropriate sample space $\Omega$ for this
experiment. Let $E$ be the subset of $\Omega$ whose elements denote outcomes
for which the number of “on” predictions is greater than one, and $F$ the
subset whose elements denote outcomes for which the number of “on”
predictions equals the number of “off” predictions. Determine the
elements in the following sets:

(a) $E \cap F$ (b) $E \cap F'$


*Humphreys, L. G. See the work contained in Bush, R. R., and
Mosteller, F., Stochastic Models for Learning (John Wiley & Sons, Inc., New York,
1955).





The event “demand exceeds supply and demand differs from supply by
100 units” is

    $E \cap H = \{(100, 0), (200, 100)\}$

The event “demand exceeds supply and supply exceeds demand” is the
impossible event $\emptyset$ since $E$ and $F$ have no common elements.

Example 2 Returning to Example 3 of Section 18.1, let $E$ denote the event
that at least two engineers selected have the same occupational speciality.
Suppose we wish to find $n(E)$, the number of elements in the subset $E$.
This number is not easily found directly, but $n(E)$ can be calculated indirectly
by using Counting Principle CP 1 (Section 7.1) to obtain

    $n(E) + n(E') = n(\Omega) = 12^6$

Here $E'$ is the event (that is, the subset of the sample space $\Omega$) which is the
complement of $E$.

Now $n(E')$ is the number of ways in which the 6 engineers can be placed
in 12 categories so that no 2 engineers share the same category. Invoking
Counting Principle CP 2 (Section 7.1), we conclude that

    $n(E') = 12 \times 11 \times \cdots \times 7$

and therefore that

    $n(E) = 12^6 - 12 \times 11 \times \cdots \times 7$

         $= 2{,}320{,}704$
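The arithmetic of Example 2 is easy to reproduce. The short sketch below is only a check of the counts; the helper name `falling_product` is an illustrative choice.

```python
def falling_product(n, k):
    # n * (n - 1) * ... * (n - k + 1): ordered selections without repetition.
    result = 1
    for i in range(k):
        result *= n - i
    return result

FIELDS = 12      # occupational specializations
ENGINEERS = 6    # engineers selected

n_omega = FIELDS ** ENGINEERS                       # all possible listings
n_E_complement = falling_product(FIELDS, ENGINEERS)  # no two engineers share a field
n_E = n_omega - n_E_complement                       # at least two share a field

print(n_omega, n_E_complement, n_E)
```

Counting the complement first avoids enumerating the many ways in which two or more engineers can share a specialization.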


PROBLEMS

1. Each of two judges ranks four subjects. Determine an appropriate sample
space. Find the number of elements in the event “the ratings agree for at
least one subject.”

2. Let $A$, $B$, and $C$ be events of a sample space $\Omega$. Using only the symbols
$\cup$, $\cap$, and $'$, express the following statements in set notation.

(a) At least one of the events $A$, $B$, or $C$ occurs.

(b) Only $A$ occurs.

(c) None occurs.

(d) At most two occur.

(e) All three occur.

(f) $A$ and $B$ occur, but not $C$.

(g) Exactly two occur.

3. Observations are recorded about three members of a control group and
three members of an experimental group. The six observations are ranked
in order of magnitude and the sum of the ranks assigned the experimental
group computed. Write the event “the sum of the ranks assigned the
experimental group is 8 or less.”





other outcome the event fails to occur. We have thus associated with this
event the subset

    $E = \{RRR, RRF, RFR, FRR\}$

of $\Omega$.
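The stock-market sample space and the event “the market rose at least twice in three days” can be generated mechanically. This is a small illustrative sketch; the variable names are assumptions, and outcomes are encoded as three-letter strings of R and F.

```python
from itertools import product

# Sample space: every sequence of R (rise) or F (fail to rise) over three days.
omega = ["".join(days) for days in product("RF", repeat=3)]

# Event E: the market rose on at least two of the three days.
E = {outcome for outcome in omega if outcome.count("R") >= 2}

print("sample space:", omega)
print("event E:", sorted(E))
```

Treating an event as a subset makes “the event occurs” a simple membership test, for example `"RFR" in E`.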

In a similar fashion we may associate with any event a collection of
elements from the sample space. For reasons of simplicity and mathematical
precision, we simply equate an event with its corresponding subset. That is,
if $\Omega$ is a sample space corresponding to a real or conceptual experiment,
then an event is a subset of $\Omega$. An event $E$ is said to occur if the experiment
results in any outcome corresponding to an element of $E$.
Since an event is a subset of the sample space $\Omega$, it follows that $\Omega$ itself and
the empty set $\emptyset$ are events. The event $\Omega$ is called the certain event, which always
occurs, while $\emptyset$ is the impossible event, which can never occur. A subset
containing a single element of $\Omega$ is called a simple event, while those subsets
containing more than one element are compound events.

Example 1 The XYZ Manufacturing Company maintains an inventory 
subject to certain conditions of supply and demand. Due to peculiarities in 
their production processes, the company can supply only 0, 100, or 200 units 
in a given time period. Moreover, in the same period, demand may be only 
50, 100, or 200 units. 
Let 

$$\Omega = \{(50, 0), (50, 100), (50, 200), (100, 0), (100, 100), (100, 200), (200, 0), (200, 100), (200, 200)\}$$

be the associated sample space, where the members of the ordered pairs 
represent demand and supply, respectively. The event described by “demand 
exceeds supply” is the subset 

$$E = \{(50, 0), (100, 0), (200, 0), (200, 100)\}$$

while the event “supply exceeds demand” is 

$$F = \{(50, 100), (50, 200), (100, 200)\}$$

and the event “demand equals supply” is 

$$G = \{(100, 100), (200, 200)\}$$

These events may not be the most informative partition of the sample 
space. For example, we may be interested in the amount of discrepancy 
between supply and demand. In this case, to say that the event “the 
discrepancy between demand and supply is 100 units” occurs means that an 
experimental outcome has been observed which corresponds to one of the 
elements in the subset 

$$H = \{(100, 0), (100, 200), (200, 100)\}$$
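
The events of this inventory example can also be enumerated mechanically. The following Python sketch is only an illustration: the names `omega`, `event`, and the predicate functions are our own and do not come from the text. It builds the sample space of (demand, supply) pairs and selects each event as the subset satisfying a condition.

```python
from itertools import product

demand_levels = (50, 100, 200)
supply_levels = (0, 100, 200)

# Sample space: all ordered pairs (demand, supply).
omega = set(product(demand_levels, supply_levels))


def event(predicate):
    """Return the subset of omega whose outcomes satisfy the predicate."""
    return {(d, s) for (d, s) in omega if predicate(d, s)}


E = event(lambda d, s: d > s)              # demand exceeds supply
F = event(lambda d, s: s > d)              # supply exceeds demand
G = event(lambda d, s: d == s)             # demand equals supply
H = event(lambda d, s: abs(d - s) == 100)  # discrepancy is 100 units

for name, ev in (("E", E), ("F", F), ("G", G), ("H", H)):
    print(name, sorted(ev))

# E, F, and G are pairwise disjoint and together exhaust omega.
print((E | F | G) == omega, not (E & F), not (E & G), not (F & G))
```

Writing an event as a predicate over outcomes mirrors the verbal description ("demand exceeds supply"), while the resulting set is the subset with which the event is equated.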

514 / BASIC CONCEPTS OF PROBABILITY 



Example 1 Consider again the rat running a T-maze. Let us take 

$$\Omega = \{L, R\}$$

and define a probability function P on the subsets of $\Omega$ by 

$$P(\Omega) = 1, \quad P(\{L\}) = \tfrac{1}{2}, \quad P(\{R\}) = \tfrac{1}{2}, \quad \text{and} \quad P(\emptyset) = 0$$

It is easily checked that Axioms P1-P3 are satisfied. 

Note that any assignment of the form 

$$P(\Omega) = 1, \quad P(\{L\}) = p, \quad P(\{R\}) = 1 - p, \quad \text{and} \quad P(\emptyset) = 0$$

where $0 \le p \le 1$, represents a ... these restrictions, the axioms say nothing ... 
assigned. 
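
For a finite sample space, the check that an assignment satisfies Axioms P1-P3 can be carried out by brute force over all subsets. The sketch below is illustrative only: the helper names `powerset`, `make_probability`, and `satisfies_axioms` are hypothetical, the value p = 1/3 is an arbitrary choice, and Axiom P3 is tested for pairs of disjoint events, which is enough when the sample space is finite.

```python
from fractions import Fraction
from itertools import combinations


def powerset(omega):
    """All subsets (events) of a finite sample space."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def make_probability(weights):
    """Build P(event) as the sum of the weights of its elements."""
    def P(event):
        return sum((weights[x] for x in event), Fraction(0))
    return P


def satisfies_axioms(omega, P):
    events = powerset(omega)
    p1 = all(P(e) >= 0 for e in events)                # Axiom P1
    p2 = P(frozenset(omega)) == 1                      # Axiom P2
    p3 = all(P(a | b) == P(a) + P(b)                   # Axiom P3 (pairs)
             for a in events for b in events if not (a & b))
    return p1 and p2 and p3


omega = {"L", "R"}
p = Fraction(1, 3)  # arbitrary illustrative value with 0 <= p <= 1
P = make_probability({"L": p, "R": 1 - p})
print(satisfies_axioms(omega, P))

# An assignment whose simple-event probabilities do not sum to 1 fails.
P_bad = make_probability({"L": Fraction(1, 2), "R": Fraction(1, 3)})
print(satisfies_axioms(omega, P_bad))
```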

Example 2 A favorite carnival game is ... 
board roughly 5 feet square marked ... 
entirely within a small square, the player ..., while if it falls on a 
line, the penny is lost. 

Let us take as a sample space the set of points indicated in Figure 1(a) 
and make the rough approximation that the probability of landing in any 
region is proportional to the area ... a penny is ... inch in 
... from any line if the player ... 

probability of winning is ... 

Example 3 Suppose 

$$\Omega = \{a_1, a_2, \ldots, a_n\}$$

is a sample space containing n distinct elements. Let $S_j = \{a_j\}$ 
be the simple event containing the element $a_j$. Writing $p_j = P(S_j)$, 



FIGURE 1 



4. How many distinct events are contained in a sample space having 
(a) two elements? (b) three elements? (c) n elements? 

5. We are interested in the seating arrangement selected by five male and 
three female subjects doing a group task at a round table. Write out the 
elements of the event “no two female subjects are seated adjacent to 
each other.” 

6. In a study reported by Sommer,* observers recorded the establishment 
of two-person communication links among eight persons seated at a 
rectangular cafeteria table with one person at each end and three on 
each side. Suppose that we wish to record only the first establishment of 
a two-person communication link. Define an appropriate sample space 
and write out the elements of the events A, B, and C described, 
respectively, by the following: 

A: Communication occurs between adjacent persons on the same side 
of the table. 

B: Communication occurs between persons seated face-to-face. 
C: Communication occurs between adjacent persons seated corner to 
corner. 

7. In Problem 20 of Section 7.2, suppose we add a chairman E who has 
veto power but no vote. List the elements of the event “the chairman is 
pivotal.” 


18.3 THE AXIOMS OF PROBABILITY 

Probability theory begins with the notions of “sample space” and “event.” 
Once these are specified for a particular experimental situation, we then can 
proceed to assign probabilities to events in accord with the following axioms. 
Logically, the meaning to be assigned to the terms probability and probability 
function is contained in these axioms and all other statements about 
probability must be derived from them. The examples and theorems of this 
section and the next illustrate the procedure to be used and contain the 
basic results. 

Axioms for Probability Let $\Omega$ be a sample space for some experiment 
and let $E_i$ (i = 1, 2, 3, ...) denote events (subsets of $\Omega$). A probability 
P is a function which assigns to each event $E_i \subset \Omega$ a real number 
$P(E_i)$, called the probability of $E_i$, subject to the following restrictions: 
P1 $P(E_i) \ge 0$ for each event $E_i \subset \Omega$. 
P2 $P(\Omega) = 1$. 
P3 If $E_1, E_2, E_3, \ldots$ is a sequence of disjoint events (that is, $E_i \cap E_j = \emptyset$ 
whenever $i \ne j$) then 

$$P(E_1 \cup E_2 \cup E_3 \cup \cdots) = P(E_1) + P(E_2) + P(E_3) + \cdots$$

*Sommer, R., “Studies in Personal Space,” Sociometry 22, 247-260 (1959). 




Recall (Section 7.2) that the process of choosing a collection of elements 
from a given set is called sampling. The set from which the objects are chosen 
is called the population. Sampling may be done with replacement, in which 
case an object drawn from the population is replaced before the next object 
is drawn, or it may be done without replacement. Changing the method of 
sampling often changes appreciably the probability of obtaining a particular 
result. 

Example 5 Edwards* describes an experiment to test a farmer’s contention 
that he could detect the presence of hidden water by using a curved 
whalebone. The farmer was shown 10 covered cans and told that 5 contained water 
and 5 were empty. He was instructed to divide the cans into two equal 
groups, one group to consist of all cans containing water, the other to 
consist of all empty cans. We are interested in the probability that the farmer 
correctly places exactly k cans in the water group just by chance. 
We may regard the cans the farmer places in the water group as a sample 
of size 5 drawn from a population of 10 items. Since the cans are selected 
without replacement, the sample space $\Omega$ contains 

$$\binom{10}{5}$$

elements, corresponding to the possible sets of cans chosen. 

Let $A_k$ denote the event that exactly k cans are correctly placed in the 
water group. These k cans may be chosen in 

$$\binom{5}{k}$$

ways from the 5 cans containing water. Similarly the other 5 − k cans can 
be incorrectly chosen in 

$$\binom{5}{5-k}$$

ways. Therefore 

$$n(A_k) = \binom{5}{k}\binom{5}{5-k}$$

Assuming equal assignments of probability to the simple events, we have 

$$P(A_k) = \frac{\binom{5}{k}\binom{5}{5-k}}{\binom{10}{5}}$$

► 
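
As a numerical check of the counting argument in Example 5, the short Python sketch below evaluates the probability of exactly k correct placements for k = 0, ..., 5. The function name and its default arguments are our own illustrative choices; only the counts (5 water cans, 5 empty cans, a group of 5) come from the example.

```python
from math import comb


def p_without_replacement(k, water=5, empty=5, group=5):
    """P(exactly k water cans in a group chosen without replacement)."""
    favorable = comb(water, k) * comb(empty, group - k)
    return favorable / comb(water + empty, group)


for k in range(6):
    print(k, round(p_without_replacement(k), 3))

# The six events partition the sample space, so the values sum to 1.
print(sum(p_without_replacement(k) for k in range(6)))
```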


*Edwards, W., “Probability Preferences in Gambling,” American Journal of Psychology 
66 (1953). 





we must have $p_j \ge 0$ (Axiom P1), and 

$$p_1 + p_2 + \cdots + p_n = 1$$

since $\Omega = S_1 \cup S_2 \cup \cdots \cup S_n$ is a partitioning of $\Omega$ into disjoint sets 
(Axioms P2 and P3). 

Now let E be the compound event $\{a_1, a_2, \ldots, a_k\}$, where $k \le n$. Then, 
using Axiom P3, 

$$P(E) = P(S_1 \cup S_2 \cup \cdots \cup S_k) = P(S_1) + P(S_2) + \cdots + P(S_k) = p_1 + p_2 + \cdots + p_k$$

► 

The argument in Example 3 is easily generalized to any compound event 
and we obtain the following theorem. 

Theorem 1 If E is an event containing a finite number of elements, then 
the probability of E is the sum of the probabilities of the simple events 
contained in E. ► 

It follows that once probabilities are assigned to the simple events, 
probabilities of compound events may immediately be determined. For finite 
sample spaces, then, we shall ordinarily specify probabilities only for simple 
events and shall compute other probabilities by applying Theorem 1. 


Example 4 Take $\Omega = \{1, 2, 3, 4, 5, 6\}$ as the sample space for the 
experiment of rolling a single die. If the die is “honest,” we would assign 
probability $\tfrac{1}{6}$ to each simple event. The probability of the compound event 
$E = \{2, 4, 6\}$, described by “the outcome is an even number,” is 

$$P(E) = P(\{2\}) + P(\{4\}) + P(\{6\}) = \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{1}{2}$$

► 

If, as in Example 4, equal probabilities are assigned to the simple events, 
then computing probabilities of compound events becomes a simple task. 
Suppose $\Omega$ has n elements, so that each simple event carries probability 1/n. 
To find the probability of a compound event E containing k elements, it is 
necessary to add k probabilities, each of which is 1/n. Hence 

$$P(E) = \frac{k}{n}$$

All we need to know in order to compute the required probability is the 
number of elements in E. The counting techniques discussed in Chapter 7 
are useful in this case. 
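
When the simple events are equally likely, the computation reduces to counting. The following minimal Python sketch illustrates this for the honest die of Example 4; the function name `prob_equally_likely` is a hypothetical helper, not notation from the text.

```python
from fractions import Fraction


def prob_equally_likely(event, omega):
    """P(E) = (number of elements in E) / (number of elements in omega)."""
    event, omega = set(event), set(omega)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(omega))


omega = {1, 2, 3, 4, 5, 6}
even = {x for x in omega if x % 2 == 0}
print(prob_equally_likely(even, omega))
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand computation.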

The assignment of equal probabilities to the simple events is frequently 
implied by the choice of adjective used to describe the experiment. Terms 
such as “an honest die,” “selected at random,” “according to chance,” and 
so forth, will be understood to call for equal assignments of probabilities. 




choosing with replacement. Hence, if we wish to minimize the probability 
of ascribing to the farmer any power to divine water when in fact he is just 
guessing, the experimental design characterized by sampling without 
replacement is the better test of his contention. 

It will be helpful in motivating new ideas and applying theoretical results 
to interpret probability in terms of the concept of relative frequency. Thus 
when we say that “an event E has probability p,” we will take this to mean 
that if the experiment is repeated a large number of times, then the event E 
is expected to occur in about (100p)% of these repetitions. For example, 
if the probability is 0.6 that the toss of a coin yields heads, then we would 
... in our interpretation of relative frequency. 

While the concept of relative frequency is useful, ... to serve as a 
substitute for the Probability Axioms, the analogy ... Certainly 
all relative frequencies are non-negative real numbers. Moreover, every 
outcome of the experiment ... the 
certain event, occurs at each repetition of the experiment and hence always 
has a relative frequency of unity. ... if two events cannot occur 
simultaneously, the number of occurrences of ... is the sum of 
the individual ... Hence the 
same statement is true for the respective ... The analogy is now 
complete. 

Example 7 Suppose the experiment of tossing a single die is repeated 
100 times, giving the results in Table 2. Let $\Omega = \{1, 2, 3, 4, 5, 6\}$ be the 
sample space for this experiment and let $E_i$ denote 
the event that the number i is uppermost on the die. The relative 
frequency of the event $O = E_1 \cup E_3 \cup E_5$, that an odd number is obtained, 
is 0.51, this being the sum of the respective relative frequencies 0.14, 0.19, 
and 0.18 of the disjoint events $E_1$, $E_3$, and $E_5$. The relative 
frequency of the event $O \cup E_4$ that the outcome was either odd or a 4 is 
0.51 + 0.13 = 0.64. The relative frequency of $\Omega = E_1 \cup E_2 \cup \cdots \cup E_6$ 
is 0.14 + 0.20 + 0.19 + 0.13 + 0.18 + 0.16 = 1. ► 


TABLE 2 

Number k                         1     2     3     4     5     6 
Frequency of occurrence of k    14    20    19    13    18    16 
Relative frequency of k       0.14  0.20  0.19  0.13  0.18  0.16 
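
The additivity of relative frequencies used in Example 7 can be reproduced directly from the frequency counts of Table 2. In this Python sketch the dictionary `counts` holds the tabulated frequencies; the function name `rel_freq` is our own.

```python
from fractions import Fraction

# Frequencies of each face in 100 tosses (Table 2).
counts = {1: 14, 2: 20, 3: 19, 4: 13, 5: 18, 6: 16}
total = sum(counts.values())


def rel_freq(event):
    """Relative frequency of an event (a set of faces)."""
    return Fraction(sum(counts[k] for k in event), total)


odd = {1, 3, 5}
print(float(rel_freq(odd)))          # odd outcome
print(float(rel_freq(odd | {4})))    # odd outcome or a 4
print(float(rel_freq(set(counts))))  # the certain event
```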



Example 6 Suppose now that we alter the experiment of Example 5 to 
the extent that the farmer is no longer required to divide the cans into two 
equal groups. Instead, on each of five trials his task is to select a can containing 
water from the group of 10 cans. After each trial, the cans are rearranged 
and the task is repeated. This is an instance of sampling with replacement. 
As before, we are interested in the probability that the farmer selects exactly 
k cans containing water just by chance. 
Since the cans are rearranged but not removed after each trial, any one of 
the 10 cans may be chosen on each of the five trials. Hence $\Omega$ contains $10^5$ 
elements. We can compute the number of elements in event $A_k$ as follows. 
First, we must choose k trials on which correct choices are made. This can 
be done in 

$$\binom{5}{k}$$

ways. On each of these trials one of the 5 cans containing water must be 
chosen. There are $5^k$ ways since the cans are replaced after each trial. 
Similarly, there are $5^{5-k}$ ways of choosing an empty can on each of the 
remaining 5 − k trials. Thus 

$$n(A_k) = \binom{5}{k} 5^k \, 5^{5-k}$$

Under the assumption that all simple events are equally likely, that is, have 
the same probability, 

$$P(A_k) = \frac{\binom{5}{k} 5^k \, 5^{5-k}}{10^5}$$

The probability that the farmer correctly identifies k cans by chance under 
each experimental condition is shown in Table 1. ► 


TABLE 1 Probabilities of the events $A_k$ in Examples 5 and 6 

k                                  0      1      2      3      4      5 
Sampling without replacement   0.004  0.099  0.397  0.397  0.099  0.004 
Sampling with replacement      0.032  0.156  0.312  0.312  0.156  0.032 
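
The two sampling designs can be compared numerically. The sketch below is an illustration only, with function names of our own choosing; it evaluates both counting formulas (Examples 5 and 6) and the probability of 4 or more correct identifications under each design.

```python
from math import comb


def p_without(k):
    """Example 5: 5 cans chosen from 10 without replacement."""
    return comb(5, k) * comb(5, 5 - k) / comb(10, 5)


def p_with(k):
    """Example 6: five independent trials, cans replaced each time."""
    return comb(5, k) * 5**k * 5**(5 - k) / 10**5


# Probability of 4 or more correct identifications under each design.
tail_without = p_without(4) + p_without(5)
tail_with = p_with(4) + p_with(5)
print(tail_without, tail_with)

# How much more likely is a perfect score when sampling with replacement?
print(p_with(5) / p_without(5))
```

The unrounded with-replacement values are 1/32 = 0.03125 for k = 0 or 5 and 10/32 = 0.3125 for k = 2 or 3, so the last printed digit of a table entry depends on the rounding convention used.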


We see from the two previous examples that probability assignments 
differ according to the experimental design. In these examples, we saw 
designs of sampling with and without replacement. The difference between these 
two approaches is most marked for the events $A_0$ and $A_5$, where the farmer 
is 8 times more likely to identify all five cans correctly by chance when he is 
allowed to sample the cans with replacement than when he samples without 
replacement. Similarly, the probability of correctly identifying 4 or more 
cans by chance is 0.103 when choosing without replacement and 0.188 when 





PROBLEMS 

1. The Fry Company has 10 district sales managers each supervising 10 
salesmen What is the probability that a group of 4 randomly selected 
salesmen contains at least 2 who have the same manager? 

2. It is known that 45% of the 100 voters in a given precinct are Republican 
and 55% are Democrats. From the list of 100 eligible voters, two people 
are selected at random and their political affiliation determined. Find the 
probability that at least one of the persons selected is a Republican if 
the sample is drawn 

(a) without replacement (b) with replacement. 

3. From each lot of 100 items produced by a certain machine, a sample of 
size 10 is drawn. Since it is unreasonable to require that each lot be 
perfect, it is decided to accept a lot if the sample contains no more than 
1 defective. What is the probability that a lot is accepted if it actually 
contains 5 defectives? 

4. Three voters order 3 candidates according to degree of preference. If the 
candidates are randomly ordered by the voters, what is the probability 
that each voter selects a different preference ordering? 

5. Suppose that an examinee guesses the answers on a multiple-choice test 
of n items, each item having m alternatives. Assuming the guesses to be 
randomly allocated among alternatives, what is the probability of getting 
exactly r correct answers? 

6. From an operations research group consisting of 5 engineers, 3 econo- 
mists, and 4 mathematicians, a task group of 4 is randomly selected. 
Find the probability that the group will consist of 

(a) 3 engineers and 1 mathematician. 

(b) 2 economists, 1 engineer, and 1 mathematician. 

(c) 4 mathematicians 

7. Pascal and Suttell* describe an experiment in which a professional 
graphologist is presented with pairs of handwriting samples from 10 
persons diagnosed as psychotics and 10 normal persons. The 
graphologist was able to identify correctly the sample handwriting of the 
psychotic in 6 of the 10 pairs presented. 

(a) What is the probability of correctly identifying 6 or more pairs by 
chance? 

(b) How many pairs would have to be judged correctly in order for the 
probability of getting at least this many by chance to be 0.05 or less? 

8. Let the sample space $\Omega$ be partitioned into the disjoint events E, F, and 
G. Suppose that $P(F \cup G) = 0.3$ and $P(E) = 2P(F) + 0.5$. Find P(E), 
P(F), and P(G). 

*... “Testing the Claims of a Graphologist,” Journal of ... 




9. Each member of a group is required to nominate another member for an 
elective office. We are interested in individual membership in n-cycles, 
an n-cycle being defined as a path from one member back to himself in 
n steps by means of directed choices (nominations). For example, in a 
group of 4, an individual can be in a 2-cycle, a 3-cycle, or a 4-cycle. 
(One-cycles are not allowed.) What is the probability, in a group of 
size N where nominations are made at random, that an individual is a 
member of a cycle of size 

(a) 2? 

(b) 3? 

(c) k, where $2 \le k \le N$? 

10. What is the probability in Problem 9 that an individual will not be a 
member of a cycle of any size? 


18.4 BASIC THEOREMS 

In order to facilitate consideration of applications of probability in the 
behavioral and social sciences, we list in this section some of the most 
useful elementary consequences of the Probability Axioms P1-P3. The proofs 
are simple, but should be studied carefully as they indicate the fundamentals 
of probabilistic thinking. We assume throughout that a sample 
space $\Omega$ and a probability function P have already been specified consistent 
with the axioms. 


Theorem 2 The probability of the impossible event is $P(\emptyset) = 0$. 

PROOF According to Theorem 1, Section 1.4, we have both 

$$\Omega \cup \emptyset = \Omega \quad \text{and} \quad \Omega \cap \emptyset = \emptyset$$

Axiom P3 applies, yielding 

$$1 = P(\Omega) = P(\Omega \cup \emptyset) = P(\Omega) + P(\emptyset) = 1 + P(\emptyset)$$

The result follows by subtracting 1 from each side of the equality. ► 


Thus the empty set must always have probability zero. However, Example 
2 of Section 18.3 shows that the converse statement is not true. That is, 
a set having zero probability need not be empty. For in that example, a set 
consisting of a single point is not empty, but it has zero area and hence zero 
probability. 

Theorem 3 For any event A, $P(A') = 1 - P(A)$. The probability of the 
complement of an event is 1 minus the probability of the event. 

PROOF From the Venn diagram of Figure 2 we see that 

$$A \cup A' = \Omega \quad \text{and} \quad A \cap A' = \emptyset$$

FIGURE 2 

Since A and A' are disjoint, Axiom P3 applies, yielding 

$$1 = P(\Omega) = P(A \cup A') = P(A) + P(A')$$

Subtracting P(A) from both sides gives the result. 


Example 1 In Example 1 of Section 18.3 we must have 

$$P(\{R\}) = 1 - P(\{L\})$$

whatever the value assigned as $P(\{L\})$. ► 


Example 2 In Example 2 of Section 18.2, if the 6 engineers are chosen at 
random, then the probability that at least 2 have the same occupational 
speciality is 

$$P(E) = \frac{n(E)}{n(\Omega)} = \frac{2{,}320{,}704}{2{,}985{,}984} \approx 0.78$$

But 

$$n(E) = n(\Omega) - n(E')$$

so 

$$P(E) = \frac{n(\Omega) - n(E')}{n(\Omega)} = 1 - \frac{n(E')}{n(\Omega)} = 1 - P(E') = 1 - \frac{12 \times 11 \times 10 \times 9 \times 8 \times 7}{12^6} \approx 0.78$$

► 
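
The arithmetic in this example is easy to confirm with a few lines of Python. The sketch assumes, as the figures $12^6 = 2{,}985{,}984$ and $12 \times 11 \times \cdots \times 7$ suggest, that each of the 6 engineers has one of 12 specialities; the variable names are our own.

```python
from math import perm

n_specialties = 12   # assumed from the figures quoted in the example
n_engineers = 6

n_omega = n_specialties ** n_engineers           # all assignments
n_all_different = perm(n_specialties, n_engineers)  # n(E'): no two alike
n_at_least_two_same = n_omega - n_all_different     # n(E)

# Theorem 3: P(E) = 1 - P(E')
p_E = 1 - n_all_different / n_omega
print(n_omega, n_at_least_two_same, round(p_E, 2))
```

Counting the complement (all specialities different) is far simpler than counting E directly, which is the practical point of Theorem 3.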


Theorem 4 If A and B are two events such that $A \subset B$, then $P(A) \le P(B)$. 
PROOF If $A \subset B$, then (see Figure 3) 

$$B = A \cup (B \cap A')$$

is a union of disjoint sets. Using Axiom P3, 

$$P(B) = P(A) + P(B \cap A')$$

FIGURE 3 

from which the result follows since $P(B \cap A')$, by Axiom P1, cannot be 
negative. 

... the following theorem. 

Theorem 5 For every event A, $0 \le P(A) \le 1$. All probabilities lie 
between 0 and 1. 


Theorem 6 Let A and B be any two events. Then 

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

PROOF A glance at Figure 4 shows that 

$$A \cup B = A \cup (B \cap A')$$

FIGURE 4 

expresses $A \cup B$ as a union of disjoint sets, so that 

$$P(A \cup B) = P(A) + P(B \cap A') \qquad (1)$$

Similarly, $B = (B \cap A) \cup (B \cap A')$ is a disjoint union, from which it follows 
that 

$$P(B) = P(B \cap A) + P(B \cap A') \qquad (2)$$

Solving for $P(B \cap A')$ in (2) and substituting into (1) yields the desired 
result. 





A geometric interpretation of Theorem 6 can be obtained from Example 
2 of Section 18.3. The total area covered by two overlapping regions is the 
sum of their individual areas minus the area of the overlap, since this was 
measured twice. 

Example 3 Suppose that it is known that 60% of 500 male employees 
favor union representation and 30% oppose representation, and of the 1000 
female employees 30% favor union representation and 60% are opposed. 
The rest have no opinion. We wish to know the probability that an employee 
selected at random will favor union representation. 
Let $\Omega$ be the sample space. Then $n(\Omega) = 1500$. Let M be the event that a male 
employee favors union representation and F be the event that a female 
employee favors union representation. Then $n(M) = 0.6 \times 500 = 300$ and 
$n(F) = 0.3 \times 1000 = 300$. 
Since $M \cap F = \emptyset$, we have 

$$P(M \cup F) = P(M) + P(F) = \frac{300}{1500} + \frac{300}{1500} = \frac{2}{5}$$

► 

Example 4 Complexity of cognitive structure was investigated by Scott* 
using a technique in which a subject was asked to specify a number of 
objects and to group the objects into as many groupings as were meaningful 
to him. Suppose that a given subject groups 15 objects into three groups in 
such a way that 

12 objects are placed in group I 
8 objects are placed in group II 
8 objects are placed in group III 
6 objects are placed in groups I and II 
6 objects are placed in groups I and III 
3 objects are placed in groups II and III 
2 objects are placed in all three groups 

Then the probability that a randomly chosen object was classified in either 
group II or group III is 

$$P(\text{II or III}) = P(\text{II}) + P(\text{III}) - P(\text{II and III}) = \frac{8}{15} + \frac{8}{15} - \frac{3}{15} = \frac{13}{15}$$

► 
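
The result of Example 4 can be cross-checked by counting objects directly. In the Python sketch below, the breakdown of the 15 objects into the seven regions of the Venn diagram is derived by us from the seven counts given in the example (for instance, 6 − 2 = 4 objects lie in groups I and II only); the variable names are illustrative.

```python
from fractions import Fraction

# Number of objects in each region, keyed by the exact set of groups
# an object belongs to (derived from the counts in Example 4).
regions = {
    ("I",): 2, ("II",): 1, ("III",): 1,
    ("I", "II"): 4, ("I", "III"): 4, ("II", "III"): 1,
    ("I", "II", "III"): 2,
}

objects = []
for groups, count in regions.items():
    objects += [frozenset(groups)] * count
n = len(objects)


def P(predicate):
    """Probability that a randomly chosen object satisfies the predicate."""
    return Fraction(sum(1 for g in objects if predicate(g)), n)


p_II = P(lambda g: "II" in g)
p_III = P(lambda g: "III" in g)
p_both = P(lambda g: "II" in g and "III" in g)

direct = P(lambda g: "II" in g or "III" in g)   # count the union directly
formula = p_II + p_III - p_both                 # Theorem 6
print(direct, formula)
```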

It is useful to extend Theorem 6 to obtain a formula for the probability 
of the union of any number of events. For n events, $E_1, E_2, \ldots, E_n$, we shall 

*Scott, W. A., “Cognitive Complexity and Cognitive Flexibility,” Sociometry 25, 405-414 
(1962). 



write $s_k$ to denote the sum of the probabilities of all intersections obtained 
by taking exactly k of the events. For example, for n = 4, 

$$s_1 = P(E_1) + P(E_2) + P(E_3) + P(E_4)$$

$$s_2 = P(E_1 \cap E_2) + P(E_1 \cap E_3) + P(E_1 \cap E_4) + P(E_2 \cap E_3) + P(E_2 \cap E_4) + P(E_3 \cap E_4)$$

$$s_3 = P(E_1 \cap E_2 \cap E_3) + P(E_1 \cap E_2 \cap E_4) + P(E_1 \cap E_3 \cap E_4) + P(E_2 \cap E_3 \cap E_4)$$

$$s_4 = P(E_1 \cap E_2 \cap E_3 \cap E_4)$$

Theorem 7 If $E_1, E_2, \ldots, E_n$ are any n events and the numbers $s_k$ are as 
defined above, then 

$$P(E_1 \cup E_2 \cup \cdots \cup E_n) = s_1 - s_2 + s_3 - s_4 + \cdots + (-1)^{n-1} s_n \qquad (3)$$

PROOF We first consider the case when n = 3. Here Formula (3) becomes 

$$P(E_1 \cup E_2 \cup E_3) = s_1 - s_2 + s_3 = P(E_1) + P(E_2) + P(E_3) - P(E_1 \cap E_2) - P(E_1 \cap E_3) - P(E_2 \cap E_3) + P(E_1 \cap E_2 \cap E_3) \qquad (4)$$

From Figure 5 we see that the union of ... 
into seven sections. The probability of $E_1 \cup E_2 \cup E_3$, ... 
events and their intersections, can be written in terms ... I-VII marked 
on Figure 5. For example, 

$$P(E_1 \cup E_2 \cup E_3) = P(\text{I}) + P(\text{II}) + \cdots + P(\text{VII}) \qquad (5)$$

$$P(E_2) = P(\text{II}) + P(\text{IV}) + P(\text{V}) + P(\text{VI}) \qquad (6)$$

$$P(E_1 \cap E_3) = P(\text{III}) + P(\text{IV}) \qquad (7)$$

FIGURE 5 

In order to show that Formula (4) is valid we need only show that it is 
equivalent to (5), that is, that the formula $s_1 - s_2 + s_3$ effectively adds the 
probability of each of the seven sections once. That this is so may be 
verified by substituting representations similar to (6) and (7) for each term 
in (4). Since section IV is contained in each of $E_1$, $E_2$, and $E_3$, its probability 
is added 3 times, subtracted 3 times, and then added once. Probabilities of 
sections II, III, and VI are added twice and subtracted once, while those of 
sections I, V, and VII are each added once. 
In the general case, we wish to argue in a similar fashion that the 
probability of each section is added once in Formula (3). Suppose M is a section 
which is contained in exactly k of the n events. Then the probability of M 
is added exactly k times in $s_1$, once for each event which contains M. This 
probability will be subtracted out in $s_2$ once for each pair of events containing 
M, a total of 

$$\binom{k}{2}$$

times. Similarly, P(M) is added once in $s_3$ for each of the 

$$\binom{k}{3}$$

triples of events containing M, etc. Thus P(M) is added a total of 

$$\binom{k}{1} - \binom{k}{2} + \binom{k}{3} - \cdots + (-1)^{k-1}\binom{k}{k}$$

times. Putting x = 1, y = −1, and n = k in Formula (4) of Section 7.3, 
we see that this number is 1, as required. ► 
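
Theorem 7 can be checked numerically on a small example. The Python sketch below assumes a finite sample space with equally likely outcomes; the four events are arbitrary illustrative choices, and the function names `prob`, `s`, and `union_by_theorem7` are our own. It computes each sum $s_k$ by running over all selections of k events and compares the alternating sum with the probability of the union computed directly.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, omega):
    """Equally likely outcomes: P(E) = n(E) / n(omega)."""
    return Fraction(len(event), len(omega))


def s(k, events, omega):
    """Sum of the probabilities of all intersections of exactly k events."""
    total = Fraction(0)
    for chosen in combinations(events, k):
        total += prob(set.intersection(*chosen), omega)
    return total


def union_by_theorem7(events, omega):
    """s_1 - s_2 + s_3 - ... + (-1)^(n-1) s_n."""
    n = len(events)
    return sum((-1) ** (k - 1) * s(k, events, omega)
               for k in range(1, n + 1))


omega = set(range(1, 13))
events = [{1, 2, 3, 4, 5, 6}, {4, 5, 6, 7, 8}, {1, 6, 8, 9}, {2, 6, 10}]

direct = prob(set().union(*events), omega)
print(direct, union_by_theorem7(events, omega))
```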

Example 5 The reliability of ratings is frequently assessed by having two 
judges independently order the same set of n elements according to some 
specified attribute. A match is said to occur if one element is assigned the 
same rank by both judges. Since reliability (consistency) appears as a 
departure from randomness, we are interested in the probability that matches 
occur by chance. 

For simplicity, we consider the ordering established by one of the judges 
as fixed and assume that each possible ordering determined by the other has 
probability 1/n!, where n is the number of elements to be ranked. Let 

$$E_j \quad (j = 1, 2, \ldots, n)$$

be the event that the item ranked in jth position by the first judge is also 
given rank j by the second judge, that is, that the two rankings match at the 
jth position. Then the probability of the event $M_n$ that at least one match 
occurs may be computed from Theorem 7 as 

$$P(M_n) = P(E_1 \cup E_2 \cup \cdots \cup E_n) = s_1 - s_2 + s_3 - \cdots + (-1)^{n-1} s_n$$

For each k, $s_k$ denotes the sum of all probabilities corresponding to events 
that matches occur at k specified positions, the other positions being 
unrestricted. For any particular set of k positions, there is one way for the 
second judge to fill these positions in order to match the first judge, and 
(n − k)! ways of filling the other positions with the n − k remaining ranks. 
Since there are 

$$\binom{n}{k}$$

different ways to choose k positions, it follows that 

$$s_k = \binom{n}{k}\frac{(n-k)!}{n!} = \frac{1}{k!}$$

Thus 

$$P(M_n) = 1 - \frac{1}{2!} + \frac{1}{3!} - \cdots + (-1)^{n-1}\frac{1}{n!}$$

Note that $1 - P(M_n)$ represents the first n + 1 terms in the expansion of 
$e^{-1}$ (see Equation (4), Section 16.1). Thus as $n \to \infty$, 

$$P(M_n) \to 1 - e^{-1} \approx 0.632$$

► 
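
The matching probability can be checked for small n by listing every possible ordering of the second judge. The following Python sketch compares the alternating series with a brute-force count and with the limiting value $1 - e^{-1}$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial


def p_match_formula(n):
    """1 - 1/2! + 1/3! - ... + (-1)^(n-1)/n!"""
    return sum(Fraction((-1) ** (k - 1), factorial(k))
               for k in range(1, n + 1))


def p_match_bruteforce(n):
    """Fraction of the n! orderings with at least one match."""
    hits = sum(1 for order in permutations(range(n))
               if any(order[i] == i for i in range(n)))
    return Fraction(hits, factorial(n))


for n in range(1, 7):
    print(n, p_match_formula(n), p_match_bruteforce(n))

print(float(p_match_formula(10)), 1 - exp(-1))
```

The brute-force count is practical only for small n, since the number of orderings grows as n!, whereas the series converges to the limit very quickly.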


Among the more colorful versions of the matching problem (Example 5) 
is the following. A group of n guests arrives ... 
his hat. Later the hat-check girl, being bored, decides to join the party 
... another and ... was ... 
the ordering of n things established by ... 
ordering. For example, a subject may attempt to match descriptions of objects 
with the objects themselves, or the preference ordering of n colors established 
by a subject may be compared with the ordering of the same n colors according 
to degree of brightness.* 

The preceding discussion exhibits a recurrent pattern in the methods for 
attacking problems in probability. In the proofs of the theorems as 
well as in the examples, at some stage a partitioning of an event was made 
and Axiom P3 was applied. Since this axiom is the only one which indicates 
how probabilities of different events ... operational 
tool for probability theory. It ... emphasized as a modus operandi for 
problems in probability that the solution can often be found 
by a judicious partitioning of ... probability is desired. 
Indeed, in many cases the proper partitioning essentially is the solution, in 
the sense that once the partition is determined, ... follows in a more 
or less routine fashion. 

Thus, so far, we have begun the ... of our problems by forming an 
event (set) statement, generally in ... of a partition appropriately 
chosen. The recognition of this approach is essential and is so often missed 
that the reader is advised to get it firmly in mind before proceeding. To sum 
up, the steps in solving a probability problem are to choose a sample space 
and a probability function, to write an event statement linking events whose 
probabilities are known with the event whose probability is desired, and, 
finally, to compute the required probability. 

*For an application of the matching ..., see Vernon, P. E., “The Matching 
Method Applied to Investigations of Personality,” Psychological Bulletin 33, 
149-177 (1936). 


PROBLEMS 

1. Two proposed bills are soon to be voted upon in Congress. It is estimated 
that 70% of the Congressmen favor bill A, 40% are opposed to bill B, 
and 60% are opposed to at least one of the two bills. A Congressman is 
selected at random. What is the probability that he will favor at least one 
of the bills? 

2. An experimenter observing the behavior of rats in a conditioning box 
has established three categories of interest: E the event “rat remains 
relatively stationary,” F the event “rat engages in bar-pressing activity,” 
and G the event “rat engages in exploratory behavior.” He assumes that 
$E \cap G = \emptyset$ and that $F \cap G = \emptyset$ and makes the following assignments of 
probability: 

$P(E \cup F) = 0.5$        $P(G \cup E) = 0.1$ 

$P(E \cup G) = 0.7$        $P[(E \cup F \cup G)'] = 0.4$ 

Comment on the assignments of probabilities. 

3. In a learning experiment, a subject is presented with a list of nonsense 
syllables. After a short exposure period, the list is removed and the 
subject is asked to recall the list. The procedure is then repeated, each 
repetition constituting a trial. Let $A_k$ be the event that the subject never 
makes an error in recalling the list and learns on trial k. (All correct 
responses prior to learning are due to guessing.) Suppose that $P(A_k) = 
(ga)^{k-1} g(1 - a)$, where g is the probability of correctly reciting the list 
by chance, and 1 − a is the per trial probability of learning.* Find the 
probability that the subject makes at least one error before learning the 
list. (Assume that the probability of eventually learning the list is 1.) 

4. Let E and F be any two events. Suppose that P(E) = 0.3, P(F) = 0.1, 
and $P(E \cap F) = 0.1$. Find the probabilities of the events 

(a) $E' \cup F'$        (b) $E' \cup F$ 

(c) $(E \cap F)'$        (d) $E \cap F'$ 

(e) $E' \cap F'$        (f) $E' \cap F$ 

5. For any two events A and B, show that 

$$P[(A \cap B') \cup (B \cap A')] = P(A) + P(B) - 2P(A \cap B)$$

Give a verbal description of this event and contrast its probability with 
the probability of the event $A \cup B$. 

*Derived from the one-element model for paired-associate learning. For a discussion see 
Atkinson, R. C., Bower, G. H., and Crothers, E. J., An Introduction to Mathematical 
Learning Theory (John Wiley & Sons, Inc., New York, 1965), pp. 84-108. 



6. A subject is asked to match each of k facial photographs with a 
corresponding emotional state. 

(a) What is the probability that he correctly matches one or more 
photographs by chance when 

(i) k = 3        (ii) k = 4 

(iii) k = 5        (iv) k = 6 

(b) Compare the probabilities obtained in part (a) with the limiting value 
... the probabilities are essentially 
independent of k. (See Section 16.1.) (Is this a surprising result?) 


7. Use mathematical induction to prove that 

$$P(E_1 \cup E_2 \cup E_3 \cup \cdots \cup E_n) = P(E_1) + P(E_2) + \cdots + P(E_n)$$

if the $E_i$ are pairwise disjoint (that is, $E_i \cap E_j = \emptyset$ when $i \ne j$). 


8. Prove that for any events $E_1, E_2, \ldots, E_n$, 

$$P(E_1 \cup E_2 \cup \cdots \cup E_n) \le P(E_1) + P(E_2) + \cdots + P(E_n)$$

9. Prove that $P(F \cap E') = P(F) - P(E \cap F)$. 

10. Prove that for any two events E and F, $P(E) = P(E \cap F) + P(E \cap F')$. 

18.5 CONDITIONAL PROBABILITY 


Suppose an experiment is performed ... 
of an event F. Suppose further that ... additional information 
about the experiment, specifically, that ... event E occurred. How is 
the probability of F affected by this ...? ... helpful to consider 
first examples in which this ... grounds. 
The answers thus obtained will then ... formulate a precise 
mathematical definition. 


Example 1 A card is drawn ... Let 
$\Omega$ consist of 52 elements corresponding to the individual cards and assign 
probabilities so that each simple event has the same probability. Then 
the event E described by “the card drawn is a heart” has probability $P(E) = \tfrac{1}{4}$, 
as does the event F “the card drawn is a spade,” while the event G “the card 
drawn is the seven of hearts” has probability $\tfrac{1}{52}$. 
Now suppose we learn that the card drawn is a heart, that is, suppose we 
are somehow informed that event E has occurred. Then event F becomes 
impossible and should reasonably carry probability zero. Event G, on the 
other hand, becomes one of only 13 possible outcomes which, from the information 
available, should still be equally likely, and hence G should now have 
probability $\tfrac{1}{13}$. 



Example 2 A coin is tossed three times. Let us take

$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$,

denote the simple events by Ei = {HHH}, E 2 = {HHT}, , Fs “ 
{77T}, and assign probabilities 4, J, iV> s> respectively, 0 

these simple events Let 

$E = \{HHT, HTH, HTT, THH, THT, TTH\} = E_2 \cup E_3 \cup \cdots \cup E_7$

be the event described by "at least one of each face occurs" and

$F = \{HHH, HHT, HTH, HTT\} = E_1 \cup E_2 \cup E_3 \cup E_4$

the event "the first toss is heads." On the basis of our assignment of probabilities,

P(E) = f and P(F) = U 

If, now, the experiment is performed and we are told that at least one of each face turned up, what is the probability that the first toss was heads? We reason as follows. We know E has occurred, but have no other information. From the initial assignment of probabilities, it is more likely that $E_2 = \{HHT\}$ occurred than that $E_6 = \{THT\}$ occurred. Indeed, in the absence of other information and in keeping with our original assignment of probabilities, the occurrence of $E_2$ is 4 times as likely as that of $E_6$. Thus the new probabilities of $E_2$ and $E_6$ (and of any other pair of events contained in E) should be in the same proportion as the old. That is, there is a constant of proportionality c such that each new probability of an event in E is obtained by multiplying the corresponding original probability by c. Denoting the new probabilities by Q, this means that $Q(E_2) = cP(E_2)$ and $Q(E_6) = cP(E_6)$. We may solve for c by noting that since E must occur, its new probability Q(E) should be unity. Thus

$1 = Q(E) = cP(E)$

or

$c = \dfrac{1}{P(E)}$

We now assign new probabilities so that

(i) each simple event in E is assigned its old probability divided by P(E),
(ii) each simple event in E' is assigned probability zero.

Returning to the original question, we have

$Q(F) = Q(\{HHH\}) + Q(\{HHT\}) + Q(\{HTH\}) + Q(\{HTT\})$

$= 0 + \dfrac{P(\{HHT\})}{P(E)} + \dfrac{P(\{HTH\})}{P(E)} + \dfrac{P(\{HTT\})}{P(E)}$

$= \dfrac{P(\{HHT, HTH, HTT\})}{P(E)} = \dfrac{P(E \cap F)}{P(E)}$ (8)

For our example,

$Q(F) = \dfrac{P(E \cap F)}{P(E)} = \dfrac{7}{10}$


Equation (8) provides a formula for revising original assignments of probabilities when it is known that an event E has occurred. For any event F, such a revised probability is called the conditional probability of F given E and is denoted by P(F | E). Note that since $E \cap F \subseteq E$, and thus by Theorem 4, $P(E \cap F) \le P(E)$, the formula

$P(F \mid E) = \dfrac{P(E \cap F)}{P(E)}$ (9)

always gives a number between 0 and 1. Thus it makes sense to call P(F | E) a probability. We note that Formula (9) is meaningful only if P(E) > 0. If P(E) = 0, conditional probabilities given E are not defined. However, (9) makes sense when E and F have no elements in common. For in this case $P(F \mid E) = 0$, which is consistent with case (ii) of Example 2.
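Formula (9) can be checked by direct enumeration on a finite sample space. The following is a minimal sketch, not part of the original text: the function name `conditional_probability` and the representation of a sample space as a Python dictionary are our own assumptions. It reproduces the card-drawing results of Example 1.

```python
from fractions import Fraction


def conditional_probability(prob, F, E):
    """Return P(F | E) = P(E and F) / P(E) on a finite sample space.

    prob maps each outcome to its probability; E and F are sets of outcomes.
    (Hypothetical helper written for this illustration.)
    """
    p_E = sum(prob[w] for w in E)
    if p_E == 0:
        raise ValueError("P(F | E) is not defined when P(E) = 0")
    return sum(prob[w] for w in E & F) / p_E


# Example 1: one card drawn from 52 equally likely cards.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "spades", "diamonds", "clubs"]
deck = [(rank, suit) for rank in ranks for suit in suits]
prob = {card: Fraction(1, 52) for card in deck}

hearts = {card for card in deck if card[1] == "hearts"}   # event E
spades = {card for card in deck if card[1] == "spades"}   # event F
seven_of_hearts = {("7", "hearts")}                       # event G

print(conditional_probability(prob, spades, hearts))           # 0
print(conditional_probability(prob, seven_of_hearts, hearts))  # 1/13
```

Exact fractions are used so that the results can be compared directly with the values obtained by hand.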

Example 3 In a study by Hake,* subjects predicted whether a horizontal H or vertical V row of lights would appear on the next trial. The stimulus presentations were arranged so that the conditional probabilities P(H | H) and P(V | V) took one pair of values for one group, and P(H | H) = 0.9 and P(V | V) = 0.7 for another group. In other words, the probability of the stimulus events H or V being repeated on the next trial differed for the two groups.

If in Equation (9) we think of keeping E fixed and letting F vary over the subsets of the sample space $\Omega$, we obtain a function P( | E), called the conditional probability function corresponding to E, which assigns numbers to events. This function is itself a probability function; that is, every conditional probability function satisfies the basic Axioms P1-P3.

Theorem 8 Let E be an event having positive probability. Then the function P( | E) defined for each event F by

$P(F \mid E) = \dfrac{P(E \cap F)}{P(E)}$

(Equation (9)) is a probability function.

PROOF (1) $P(F \mid E) = P(E \cap F)/P(E) \ge 0$ for all F, since P(E) > 0 by hypothesis and $P(E \cap F) \ge 0$.

(2) $P(\Omega \mid E) = P(E \cap \Omega)/P(E) = P(E)/P(E) = 1$.

(3) If the events $F_1, F_2, F_3, \ldots$ are disjoint, then so are the events $F_1 \cap E$, $F_2 \cap E$, $F_3 \cap E$, ..., and we have

$P\left(\bigcup_{n=1}^{\infty} F_n \mid E\right) = \dfrac{P\left[E \cap \left(\bigcup_n F_n\right)\right]}{P(E)} = \dfrac{P\left[\bigcup_n (E \cap F_n)\right]}{P(E)} = \sum_{n=1}^{\infty} \dfrac{P(E \cap F_n)}{P(E)} = \sum_{n=1}^{\infty} P(F_n \mid E)$ ►

*Hake, H. W., and Hyman, R., "Perception of the Statistical Structure of a Random Series of Binary Symbols," Journal of Experimental Psychology 45, 64-74 (1953).


As an immediate consequence of this result, it follows that the function P( | E) has all the properties derived in Section 18.4 for probability functions. In particular, corresponding to Theorems 2-6, we have the following theorem.


Theorem 9 The following statements are valid for conditional probabilities:

(i) $P(\emptyset \mid E) = 0$

(ii) $P(F' \mid E) = 1 - P(F \mid E)$

(iii) If $F \subseteq G$, $P(F \mid E) \le P(G \mid E)$

(iv) For any event F, $0 \le P(F \mid E) \le 1$

(v) $P(F \cup G \mid E) = P(F \mid E) + P(G \mid E) - P(F \cap G \mid E)$ ►


PROBLEMS 

1. A state legislature is composed of 60% Republicans and 40% Democrats. It is known that 35% of the Republicans and 60% of the Democrats favor a pending piece of legislation. What is the conditional probability that a legislator who favors the legislation is a Republican? a Democrat?

2. The Keep Cool Company specializes in the sale of air conditioners. It finds that the seasonal demand for its product varies depending upon whether the summer is abnormally cool, average, or abnormally warm. On the basis of past experience, they assign the following conditional probabilities to the various states of demand:

                           demand (number of units)
                           0-100     100-300     300-500
$x_1$ cool summer           0.8        0.2         0
$x_2$ average summer        0.1        0.5         0.4
$x_3$ warm summer           0          0.4         0.6

Suppose that $P(x_1) = 0.2$, $P(x_2) = 0.5$, and $P(x_3) = 0.3$. The company wishes to stock only enough units to insure an adequate supply to meet the upper limit in the demand state most likely to occur. If air conditioners can be purchased only in 100 unit lots and the company wishes to order early for the purpose of taking advantage of preseason price discounts, how many lots should they order?

3. It is known that in an unincorporated community of 100 families, 60% favor incorporation and 40% are opposed. A sample of 10 families is drawn and their preferences noted.

(a) Find the conditional probability that the fifth family sampled favors incorporation given that 7 of the 10 families in the sample favored incorporation

(i) if the sample is drawn with replacement
(ii) if the sample is drawn without replacement

How do you interpret these results?

(b) Show that if a sample of size n, whether drawn with or without replacement, contains k families who favor incorporation, then the probability is k/n that any particular family sampled favors incorporation.

4. Consider a family which has two children. Assume that each child is equally likely to be a boy or a girl. If we know that the conditional probability is 1/2 that both children are male given that the older child is male, what is the conditional probability that both are male given that at least one of the children is a male?

5. From a three-man interview team, one man is to be selected at random for a special assignment. One of the members, having some familiarity with probability, reasons that the probability of his being selected is 1/3. He asks the boss to tell him which one of his fellow members will not be chosen, claiming that since he already knows that at least one member will not be chosen, there can be no harm in divulging this information. The boss refuses on the grounds that if our probabilistically inclined man knew which member would not be chosen, then the probability of his being chosen would increase to 1/2, since he would be one of two members, one of whom would be chosen for the special assignment. Show that the probability is still 1/3 even if the boss answered the question, providing that in the event that our inquisitive man were to be selected the boss is just as likely to name one of the remaining two members as the other.

6. Suppose that a population* of applicants for entrance to a private college, where all applicants take a certain scholastic aptitude test (SAT), is divided into four categories: $x_1$, high school graduate, SAT above 70; $x_2$, not high school graduate, SAT above 70; $x_3$, high school graduate, SAT below 70; and $x_4$, not high school graduate, SAT below 70. Assume on the basis of past experience that for a given applicant selected, $P(x_1) = 0.60$, $P(x_2) = 0.15$, $P(x_3) = 0.05$, and $P(x_4) = 0.20$. Suppose further that each applicant is interviewed and a decision made to accept ($y_1$) or to reject ($y_2$) the applicant. Let the tabulations of the interviewer's conditional probabilities $P(y_j \mid x_i)$ be as follows:

                 decision
category     accept     reject
$x_1$         0.80       0.20
$x_2$         0.80       0.20
$x_3$         0.20       0.80
$x_4$         0.10       0.90

Find the array of conditional probabilities $P(x_i \mid y_j)$, where each $P(x_i \mid y_j)$ denotes the probability that the applicant belongs in category $x_i$ given the result ($y_j$) of the interviewer's decision.

* Example adapted from Cronbach, L. J., and Gleser, G. C., Psychological Tests and Personnel Decisions (University of Illinois Press, Urbana, Ill., 1957).

7. If B is an event with positive probability, show that for any event A

(a) $A \subseteq B$ implies $P(A \mid B) = P(A)/P(B)$

(b) $B \subseteq A$ implies $P(A \mid B) = 1$

8. Prove Theorem 9 directly from Formula (9) without using theorems established in Section 18.4.

9. Prove that if F and G are any events, then

$P(F \cap G \mid E) \le P(F \mid E) \le P(F \cup G \mid E) \le P(F \mid E) + P(G \mid E)$

10. Prove the following statements:

(a) $P(E \mid E) = 1$ if $P(E) > 0$

(b) If $P(E) = 1$ and F is any event, then $P(F \mid E) = P(F)$

(c) If $P(E) > 0$ and E and F are mutually exclusive, then $P(F \mid E) = 0$

(d) $P(F \mid E') = [P(F) - P(E \cap F)]/[1 - P(E)]$

11. Suppose that E and F are two events, each with positive probability. Show that only one of the following statements is, in general, true:

(a) $P(E \mid F) + P(E' \mid F) = 1$

(b) $P(E \mid F) + P(E' \mid F') = 1$

(c) $P(E \mid F) + P(E \mid F') = 1$

12. Let E be an event having positive probability and suppose that events $F_1, F_2, F_3, \ldots$ partition the sample space $\Omega$. Prove that $\sum_n P(F_n \mid E) = 1$

(a) from basic principles

(b) using Theorem 8


18.6 APPLICATIONS OF CONDITIONAL PROBABILITY

The definition (9) specifies P(F | E) in terms of previously assigned probabilities P(E) and $P(E \cap F)$. In practice, however, it is often more natural to assign conditional probabilities first and to use these to specify the probability function P. This is done by solving (9) for $P(E \cap F)$ to obtain the following theorem.

Theorem on Compound Probabilities For any events E and F such that P(E) > 0,

$P(E \cap F) = P(E)P(F \mid E)$ (10)

The probability of E and F is the probability of E times the conditional probability of F, given E.

Example 1 Two balls are drawn at random and without replacement from an urn* containing two red balls and three green balls. What is the probability that a red ball is drawn first followed by a green ball?

Since draws are to be made "at random," equal probabilities are assigned to each ball in the urn. The probability of obtaining a red ball on the first draw is thus $P(R_1) = 2/5$. If a red ball is drawn first, one red and three green balls remain, so that the (conditional) probability of green on the second draw would be $P(G_2 \mid R_1) = 3/4$. These assignments together imply that

$P(R_1 \cap G_2) = P(R_1)P(G_2 \mid R_1) = \dfrac{2}{5} \cdot \dfrac{3}{4} = \dfrac{3}{10}$
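The value 3/10 can be confirmed by listing all equally likely ordered pairs of draws. The sketch below is our own illustration; the ball labels and the helper `prob` are assumptions made for the example, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations

# Two red and three green balls; labels distinguish balls of the same color.
urn = ["R1", "R2", "G1", "G2", "G3"]

# Drawing twice without replacement gives 5 * 4 = 20 equally likely ordered pairs.
draws = list(permutations(urn, 2))


def prob(event):
    """Probability of an event given as a predicate on an ordered pair of draws."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


def red_first(d):
    return d[0].startswith("R")


def green_second(d):
    return d[1].startswith("G")


p_r1 = prob(red_first)
p_r1_and_g2 = prob(lambda d: red_first(d) and green_second(d))
p_g2_given_r1 = p_r1_and_g2 / p_r1   # Formula (9)

print(p_r1, p_g2_given_r1, p_r1_and_g2)   # 2/5 3/4 3/10
```

The enumeration assigns probabilities to the joint outcomes first and recovers the conditional probability from (9), the reverse of the order used in the example; both routes agree.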

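The Theorem on Compound Probabilities may be extended to give a multiplying rule for computing the probability of any intersection of events in terms of conditional probabilities.

Theorem on Compound Probabilities Let $E_1, E_2, \ldots, E_n$ be any collection of events for which $P(E_1 \cap E_2 \cap \cdots \cap E_{n-1}) > 0$. Then

$P(E_1 \cap E_2 \cap \cdots \cap E_n) = P(E_1)P(E_2 \mid E_1)P(E_3 \mid E_1 \cap E_2) \cdots P(E_n \mid E_1 \cap E_2 \cap \cdots \cap E_{n-1})$ (11)

PROOF When n = 2, (11) becomes $P(E_1 \cap E_2) = P(E_1)P(E_2 \mid E_1)$, which is (10). Using (10) again, for any k we have

$P(E_1 \cap E_2 \cap \cdots \cap E_k \cap E_{k+1}) = P(E_1 \cap E_2 \cap \cdots \cap E_k)P(E_{k+1} \mid E_1 \cap E_2 \cap \cdots \cap E_k)$

The inductive assumption applied to $P(E_1 \cap E_2 \cap \cdots \cap E_k)$ produces the required result.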

*Examples of the type discussed in this section were first considered in 18th and 19th century writings. The use of urns and other such objects is derived from traditional terminology.

Example 2 The one-element learning model characterizes the learning of a simple stimulus-response association by assuming that on each trial the stimulus element is in one of two states: $C$, associated with the correct response, or $\bar{C}$, not associated with the correct response. If the element is in state $\bar{C}$ on the ith trial, it is assumed to move to state $C$ on the (i + 1)st trial with probability p. Once the element is in state $C$, it is assumed to remain there indefinitely. At the start of the experiment, the stimulus element is assumed to be in state $\bar{C}$. Symbolically,

$P(\bar{C}_1) = 1$, $P(C_{i+1} \mid \bar{C}_i) = p$, and $P(\bar{C}_{i+1} \mid C_i) = 0$ (12)

We are interested in the probability of the event $C_n$ that the subject is in state $C$ on trial n. Since on any trial the stimulus element must be either in $C$ or $\bar{C}$, $P(C_n) = 1 - P(\bar{C}_n)$ and we need only derive $P(\bar{C}_n)$.

In order for an element to be in state $\bar{C}$ on trial n, it must have been in state $\bar{C}$ on each of the preceding n - 1 trials, that is,

$\bar{C}_n = \bar{C}_1 \cap \bar{C}_2 \cap \cdots \cap \bar{C}_n$ (13)

Applying the Theorem on Compound Probabilities gives

$P(\bar{C}_n) = P(\bar{C}_1)P(\bar{C}_2 \mid \bar{C}_1)P(\bar{C}_3 \mid \bar{C}_1 \cap \bar{C}_2) \cdots P(\bar{C}_n \mid \bar{C}_1 \cap \bar{C}_2 \cap \cdots \cap \bar{C}_{n-1})$

Using (12) and (13) this reduces to

$P(\bar{C}_n) = P(\bar{C}_1)P(\bar{C}_2 \mid \bar{C}_1) \cdots P(\bar{C}_n \mid \bar{C}_{n-1}) = (1 - p)^{n-1}$

so that

$P(C_n) = 1 - (1 - p)^{n-1}$ ►
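The closed form for $P(C_n)$ can be compared with a trial-by-trial computation based on (12). This is a minimal sketch under our own naming; the two functions below are illustrative and do not come from the text.

```python
from fractions import Fraction


def prob_conditioned_closed_form(n, p):
    """P(C_n) = 1 - (1 - p)^(n - 1), the result derived in Example 2."""
    return 1 - (1 - p) ** (n - 1)


def prob_conditioned_stepwise(n, p):
    """P(C_n) obtained by following the unconditioned state trial by trial.

    The element starts unconditioned with probability 1 and stays
    unconditioned from one trial to the next with probability 1 - p.
    """
    p_unconditioned = 1
    for _ in range(n - 1):
        p_unconditioned = p_unconditioned * (1 - p)
    return 1 - p_unconditioned


p = Fraction(1, 4)   # illustrative value of the conditioning probability
for n in range(1, 6):
    print(n, prob_conditioned_stepwise(n, p), prob_conditioned_closed_form(n, p))
```

On trial 1 both computations give 0, in agreement with the assumption that the element starts in the unconditioned state.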


Example 3 A member of Congress participates in successive roll call votes on an issue. Suppose that if he votes in favor of the issue on the ith roll call, event $F_i$, he will vote favorably on the (i + 1)st roll call with probability 1 - p. If he votes against the issue on the ith roll call, event $U_i$, he will vote favorably on the (i + 1)st roll call with probability p. If the Congressman is initially in favor of the issue, what is the probability that he will vote favorably on (a) the fourth roll call vote? (b) the nth roll call vote?

The solution is as follows. (a) Let $p_n = P(F_n)$ denote the probability of voting favorably on the nth roll call. For the case n = 4, the problem may be solved with the aid of the tree diagram in Figure 6. The entries p and 1 - p indicate the appropriate conditional probabilities for the various branches. According to the Theorem on Compound Probabilities, the probability for any path is found by multiplying the composite branch probabilities. The probability $P(F_4)$ of a favorable vote on the fourth trial, given a favorable position initially, is the sum of the probabilities of all paths ending in F on trial 4. Thus

$p_4 = P(F_4) = (1 - p)^4 + p^4 + 6(1 - p)^2 p^2$

FIGURE 6 (tree diagram of successive roll call votes)

(b) For the general case, we note first that, regardless of the particular sequence of votes cast by the Congressman, he must vote either for the issue or against it on the nth roll call. This means that

$F_n \cup U_n = \Omega$

and thus

$F_{n+1} = F_{n+1} \cap \Omega = F_{n+1} \cap (F_n \cup U_n) = (F_{n+1} \cap F_n) \cup (F_{n+1} \cap U_n)$ (14)

where the last step follows from the distributive law for sets. Writing probabilities in (14) gives

$p_{n+1} = P(F_{n+1}) = P(F_n \cap F_{n+1}) + P(U_n \cap F_{n+1})$

$= P(F_n)P(F_{n+1} \mid F_n) + P(U_n)P(F_{n+1} \mid U_n)$

$= P(F_n)(1 - p) + P(U_n)p$

Since $P(U_n) = 1 - P(F_n) = 1 - p_n$, this may be written in the form of a recursive relation as

$p_{n+1} = p_n(1 - p) + (1 - p_n)p = (1 - 2p)p_n + p$

or as a difference equation

$(\Delta + 2p)p_n = p$

Solving this equation we find

$p_n = (1 - 2p)^n p_0 + \dfrac{1 - (1 - 2p)^n}{2}$

For large n, $p_n$ is close to 1/2 regardless of the initial probability $p_0$, a fact which may help explain the standard parliamentary device of minority delaying tactics. ►
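The recursion, its closed-form solution, and the tree-diagram result for the fourth roll call can be checked against one another numerically. The sketch below is ours; the function names are illustrative, and we assume, as in part (a), that an initially favorable position corresponds to $p_0 = 1$.

```python
from fractions import Fraction


def favorable_by_recursion(n, p, p0=1):
    """Apply p_{k+1} = (1 - 2p) p_k + p a total of n times, starting from p0."""
    p_n = p0
    for _ in range(n):
        p_n = (1 - 2 * p) * p_n + p
    return p_n


def favorable_closed_form(n, p, p0=1):
    """p_n = (1 - 2p)^n p0 + [1 - (1 - 2p)^n] / 2."""
    return (1 - 2 * p) ** n * p0 + (1 - (1 - 2 * p) ** n) / 2


p = Fraction(1, 5)   # illustrative switching probability
tree_result = (1 - p) ** 4 + p ** 4 + 6 * (1 - p) ** 2 * p ** 2   # part (a)

print(favorable_by_recursion(4, p))    # 353/625
print(favorable_closed_form(4, p))     # 353/625
print(tree_result)                     # 353/625
print(float(favorable_closed_form(50, p)))   # close to 0.5
```

With p = 1/5 the three computations agree at n = 4, and by n = 50 the probability is already indistinguishable from 1/2 in ordinary floating-point output.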


Example 4 Polya Urn Model* An urn initially contains r red and g green balls. A ball is drawn at random. It is replaced and c balls of the same color are added to the urn (c may be zero, positive, or negative). A second ball is then drawn, replaced, and again c balls of the color drawn are added to the urn. The outcome of drawing a red ball on the ith drawing is denoted by $R_i$, and the outcome of drawing a green ball on the ith drawing by $G_i$. This process may be continued indefinitely.

The addition to the urn of balls of the same color drawn provides a rough model of phenomena where the occurrence of one case increases the probability of further cases. For example, this model could be used to describe the spread of rumors or contagious diseases. The Polya urn model becomes the statistical model of sampling with replacement from a finite population when c = 0, and it becomes the sampling-without-replacement model when c = -1. Thus, we see that the Polya urn model is a versatile model and has many applications.

We shall consider various aspects of the Polya urn model and leave others as exercises.

(a) The probability that the second ball is green, $G_2$, given that the first ball is red, $R_1$, is

$P(G_2 \mid R_1) = \dfrac{g}{r + g + c}$

For, if the first ball is red, the second drawing is made at random from an urn containing r + c red balls and g green balls.

(b) The event "red on draw two" can occur following either red or green on draw one. Symbolically,

$R_2 = (R_1 \cap R_2) \cup (G_1 \cap R_2)$

(see Figure 7) so that we have

$P(R_2) = P(R_1 \cap R_2) + P(G_1 \cap R_2)$

$= P(R_1)P(R_2 \mid R_1) + P(G_1)P(R_2 \mid G_1)$

FIGURE 7

Arguing as in (a) gives

$P(R_2) = \dfrac{r}{r + g} \cdot \dfrac{r + c}{r + g + c} + \dfrac{g}{r + g} \cdot \dfrac{r}{r + g + c} = \dfrac{r}{r + g}$

and we find that $P(R_2) = P(R_1)$.

Let us examine what this means for c = -1. Suppose that among 50 new cars there is one "lemon." Your chances of getting it are the same whether you are the first or the second customer. Indeed, from Problem 12, we find that in the Polya urn model, $P(R_n) = r/(r + g)$ for every n. For the present situation this means that no matter how many cars have already been sold, your chance of being unlucky is still 1/50. This assumes, of course, that you are ignorant of which car is defective. If the defective car has already been sold prior to your purchase, your chance is reduced to zero, while, if it remains, your chance has increased over that of the initial customer. In the absence of precise knowledge, the probabilities of these two possibilities offset each other.

*After George Polya, a mathematician born in Budapest, who has spent most of his teaching career in this country at Stanford University.

(c) From the definition of conditional probability (9), we have

$P(R_1 \mid R_2) = \dfrac{P(R_1 \cap R_2)}{P(R_2)} = \dfrac{P(R_1)P(R_2 \mid R_1)}{P(R_2)}$

but from (b), $P(R_1) = P(R_2)$, and thus

$P(R_1 \mid R_2) = P(R_2 \mid R_1) = \dfrac{r + c}{r + g + c}$

This appears to be a surprising result, since we might be tempted to reason that "$R_1$ occurs before $R_2$ and hence cannot be affected by $R_2$." However, it is easy to see that this is false reasoning. For any events $R_1$ and $R_2$ having positive probabilities,

$P(R_1)P(R_2 \mid R_1) = P(R_1 \cap R_2) = P(R_2)P(R_1 \mid R_2)$ (15)

From this it follows that if $P(R_2 \mid R_1) \ne P(R_2)$, then $P(R_1 \mid R_2) \ne P(R_1)$. Obviously this argument applies to any two events E and F. Note that the conditional probability P(F | E) is defined for any events E and F for which P(E) > 0. There is no requirement that F follow E in a time sequence, our interpretation notwithstanding. In fact, there is no temporal element in probability, an event being just a subset of the sample space.

To look at the relation between $P(R_1 \mid R_2)$ and $P(R_2 \mid R_1)$ another way, suppose we draw without replacement from an urn containing one red ball and three green balls. What is the value of $P(R_1 \mid R_2)$? Obviously the answer is zero, for if the second ball is red, the first cannot be. Here the result of the second draw affects our assignment of probabilities to the first draw. Equation (15) shows that this also happens in less obvious cases.

Finally, let us return to Example 1, since it is a special case of the Polya urn model, in which $P(R_1 \mid R_2) = P(R_2 \mid R_1) = 1/4$. This result may be interpreted in a relative-frequency sense as follows. Consider the experiment of drawing twice from the urn and repeat this experiment many times over, say 10 million times. Since $P(R_1 \cap G_2) = P(G_1 \cap R_2) = P(G_1 \cap G_2) = 3/10$ and $P(R_1 \cap R_2) = 1/10$, then according to the relative-frequency interpretation (Section 18.3), there will be roughly 3 million each of outcomes $R_1 \cap G_2$, $G_1 \cap R_2$, and $G_1 \cap G_2$, while $R_1 \cap R_2$ occurs about 1 million times. Now let us consider those outcomes, of which there are approximately 4 million, in which red appears on the second draw. Of these, red also appears on draw one in about 1 million cases, or about one quarter of the total, agreeing with our first calculation. ►
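The two-draw probabilities of the Polya urn model are easily tabulated for particular values of r, g, and c. The following sketch is our own illustration (the function `polya_joint` and the dictionary layout are assumptions); it reproduces the figures quoted for Example 1, where r = 2, g = 3, and c = -1.

```python
from fractions import Fraction


def polya_joint(r, g, c):
    """Joint probabilities of the colors on the first two draws of a Polya urn.

    After the first draw the ball is replaced and c balls of the color
    drawn are added, so the second draw is made from r + g + c balls.
    """
    total = r + g
    p_r1 = Fraction(r, total)
    p_g1 = Fraction(g, total)
    return {
        ("R", "R"): p_r1 * Fraction(r + c, total + c),
        ("R", "G"): p_r1 * Fraction(g, total + c),
        ("G", "R"): p_g1 * Fraction(r, total + c),
        ("G", "G"): p_g1 * Fraction(g + c, total + c),
    }


joint = polya_joint(r=2, g=3, c=-1)          # Example 1 as a special case
p_r1 = joint[("R", "R")] + joint[("R", "G")]
p_r2 = joint[("R", "R")] + joint[("G", "R")]
p_r1_given_r2 = joint[("R", "R")] / p_r2
p_r2_given_r1 = joint[("R", "R")] / p_r1

print(joint[("R", "R")], joint[("R", "G")])   # 1/10 3/10
print(p_r1, p_r2)                              # 2/5 2/5
print(p_r1_given_r2, p_r2_given_r1)            # 1/4 1/4
```

Other values of c, for instance c = 0 or c = 2, may be substituted to see that $P(R_2) = r/(r + g)$ continues to hold.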


We conclude this section with two simple theorems which are widely used in applications.

Theorem 10 Let events $E_1, E_2, E_3, \ldots$ be a partition of $\Omega$, that is, the events $E_i$ are disjoint and $\bigcup_i E_i = \Omega$. Let A be any event. Then

$P(A) = \sum_i P(E_i)P(A \mid E_i)$ (16)

PROOF We have

$A = A \cap \Omega = A \cap \left(\bigcup_i E_i\right) = \bigcup_i (A \cap E_i)$

by the distributive law for sets. Since the $E_i$ are disjoint, then so are $A \cap E_1$, $A \cap E_2$, ..., and thus

$P(A) = P\left[\bigcup_i (A \cap E_i)\right] = \sum_i P(A \cap E_i)$

Writing $P(A \cap E_i) = P(E_i)P(A \mid E_i)$ completes the proof. ►
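Formula (16) amounts to a weighted sum over the events of the partition. As a minimal sketch (the function name and the use of dictionaries keyed by event labels are our own assumptions), it may be applied to the urn of Example 1, where the partition is "red first" and "green first" and A is "red on the second draw."

```python
from fractions import Fraction


def total_probability(priors, conditionals):
    """Return P(A) as the sum of P(E_i) * P(A | E_i) over a partition {E_i}.

    priors[e] is P(E_i) and conditionals[e] is P(A | E_i) for each label e.
    """
    if sum(priors.values()) != 1:
        raise ValueError("the probabilities of a partition must sum to 1")
    return sum(priors[e] * conditionals[e] for e in priors)


# Urn with two red and three green balls, drawing without replacement.
priors = {"R1": Fraction(2, 5), "G1": Fraction(3, 5)}
conditionals = {"R1": Fraction(1, 4), "G1": Fraction(2, 4)}   # P(R2 | first draw)

p_red_second = total_probability(priors, conditionals)
print(p_red_second)   # 2/5
```

The result 2/5 equals the probability of red on the first draw, as found for the Polya urn model in Example 4(b).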


Theorem 11 Bayes' Theorem* Let $E_1, E_2, E_3, \ldots$ partition $\Omega$ and let A be any event. Then, for each k,

$P(E_k \mid A) = \dfrac{P(E_k)P(A \mid E_k)}{\sum_i P(E_i)P(A \mid E_i)}$

PROOF The result follows immediately by substituting Formula (16) into

$P(E_k \mid A) = \dfrac{P(A \cap E_k)}{P(A)} = \dfrac{P(E_k)P(A \mid E_k)}{P(A)}$

*After the Reverend Thomas Bayes (1702-1761), an English mathematician.


Example 5 Suppose that two classes of patients are to be distinguished, say schizophrenics and neurotics. It has been found that drug therapy is effective in the treatment of schizophrenics, while psychotherapy is particularly effective in the treatment of neurotics. It is expedient that a patient be correctly classified, since a misclassification may prolong confinement. In order to aid in classification, a psychological test is administered to entering patients. Past experience has shown that 20% of the neurotics and 60% of the schizophrenics pass the test, that is, score above an arbitrary cutting score. It is also known that in an undiagnosed group, roughly 40% actually are schizophrenic and 60% are neurotic.

From a large sample of incoming patients, one patient is selected at random and given the test. If the patient passes the test, what is the probability that he actually is a neurotic?

Let $E_1$ denote the event that the patient selected is a neurotic and E denote the event that the patient passes the test. The problem requires $P(E_1 \mid E)$. We have

$P(E_1) = 0.6$, $P(E_1') = 0.4$, $P(E \mid E_1) = 0.2$, and $P(E \mid E_1') = 0.6$

From Theorem 11, we have

$P(E_1 \mid E) = \dfrac{P(E_1)P(E \mid E_1)}{P(E_1)P(E \mid E_1) + P(E_1')P(E \mid E_1')} = \dfrac{0.6 \times 0.2}{(0.6 \times 0.2) + (0.4 \times 0.6)} = \dfrac{1}{3}$

Given the information that a patient passed the test, the probability of his being a neurotic is reduced to 1/3, as compared with an unconditional probability of 0.60. Hence, if it is known that a patient passed the test, the odds are 2 to 1 that the patient is a schizophrenic and should therefore be given drug therapy rather than psychotherapy. ►

This example illustrates the use of Bayes' Theorem. The events $E_1, E_2, E_3, \ldots$ in Theorem 11 are called hypotheses and are assumed to partition $\Omega$. The probability $P(E_k)$ is called the a priori probability of the hypothesis $E_k$. The probability $P(E_k \mid E)$ is termed the a posteriori probability of the hypothesis $E_k$. Given that an event E has occurred, the hypothesis with the highest a posteriori probability of occurrence is taken to be the most plausible explanation accounting for the event.
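The passage from a priori to a posteriori probabilities can be written as a short routine. The sketch below is ours and is offered only as an illustration: the function `posterior` and the hypothesis labels are assumptions, while the numerical values are those of Example 5.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return the a posteriori probabilities P(E_k | E) for every hypothesis.

    priors[h] is the a priori probability P(E_k); likelihoods[h] is P(E | E_k).
    The denominator is P(E), computed as in Theorem 10.
    """
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    if evidence == 0:
        raise ValueError("the observed event has probability zero")
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Example 5: the observed event E is "the patient passes the test."
priors = {"neurotic": Fraction(6, 10), "schizophrenic": Fraction(4, 10)}
likelihoods = {"neurotic": Fraction(2, 10), "schizophrenic": Fraction(6, 10)}

post = posterior(priors, likelihoods)
most_plausible = max(post, key=post.get)

print(post["neurotic"], post["schizophrenic"])   # 1/3 2/3
print(most_plausible)                            # schizophrenic
```

The hypothesis returned by `max` is the one with the highest a posteriori probability, that is, the most plausible explanation in the sense described above.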


PROBLEMS

1. A questionnaire is submitted to a group of employed people, 5% of whom are unskilled laborers, 10% semiskilled laborers, 30% skilled laborers, 30% clerical workers, and 25% managerial and professional workers. Fifty percent of the unskilled, 40% of the semiskilled, 45% of the skilled, 30% of the clerical, and 20% of the managerial and professional workers report that they are dissatisfied with their present job. What is the probability that a respondent selected at random will report that he is satisfied with his job?

2. Relative to Example 2, let $E_n$ be the event that an incorrect response occurs on trial n and $\bar{E}_n$ be the event that a correct response occurs on trial n. If the probabilities of an incorrect response, given that the subject is in the conditioned and unconditioned state, respectively, are

P(£, I C.) - 0 and P(E n | C„) = 1 - i 

show that 

(a) the probability of an error on trial n is 

P(E.) - (l - 7) 0 - P)*”‘ 

(b) the probability of a correct response on trials 2 and 3 is 

Pfa n Es) - p+ (l - p)^ + (i - />) 3 Q 

3. Find the general formula in Problem 2 for 

(a) P(E 2 n £ 3 ) (b) p(£ 2 n £ 3 ) (c) P(E 2 n £ 3 ) 

4. Suppose that in a certain geographic area, census data reveal that 40% of the rural population move to the city yearly, while 20% of the urban dwellers move to the country each year. Initially, 60% of the population were rural dwellers. In the long run, what proportion of the total population will be urban dwellers?

5. In an attempt to apply Bayes' Theorem to mental testing, Calandra* assumes that an examinee either knows the response to a test item, probability p, or he guesses, probability 1 - p. Let us assume that, given an examinee knows the answer, the probability of his answering the item correctly is 1. Given that an examinee does not know the answer, the probability of his answering correctly is assumed to be 1/k, where k is the number of multiple-choice alternatives.

Given a correct answer, show that the conditional probability that the respondent knew the item is

$\dfrac{kp}{1 + (k - 1)p}$

6. Three members of a four-member discussion group initially favor an issue, while the fourth member is opposed. Suppose that a member is selected at random and given the opportunity to defend his position. Let us further suppose that after a given member speaks, one member of the opposition (if any exists) changes his position. After the first member speaks, a member is again randomly selected and given the opportunity to speak. The process continues indefinitely.

(a) What is the probability that the members will be in agreement at the end of the (i) third speech? (ii) fourth speech? (iii) fifth speech?

(b) What is the probability that a favorably inclined member will be selected to give the (i) second speech? (ii) third speech? (iii) fourth speech?

*Calandra, A., "Scoring Formulas and Probability Considerations," Psychometrika 6, 1-9 (1941).

7. The following urn scheme is proposed as a model of the spread of rumors through a primitive tribe. An urn initially contains b black balls and w white balls. If a white ball is drawn, it is replaced and E, the event that a tribe member selected at random has not heard the rumor, occurs. If a black ball is drawn, event E' occurs, the ball is replaced, and k white balls are replaced with k black balls. The sampling process continues until the white balls are exhausted. If b = 1, w = 6, and k = 2,

(a) What is the probability that the second tribe member sampled will not have heard the rumor?

(b) Calculate the probability that the third member sampled has not heard the rumor. Compare with (a).

(c) What is the probability that the process terminates at the end of the third sample?

(d) What is the probability that the process terminates at the end of the fourth sample?

8. Bush and Sternberg* have proposed a model which views learning as a direct change in response probabilities from one trial to the next. The basic idea is that learning a stimulus-response association occurs over a sequence of changing error probabilities denoted by $q_1, q_2, q_3, \ldots$

The successive q values are assumed to be related by

$q_{i+1} = \alpha q_i$

where $0 < \alpha < 1$.

If $p_n$ denotes the probability of a correct response on trial n, so that $p_n + q_n = 1$, find the probability of a correct response (a) on trial 3 and (b) on trial n. What is the limiting probability as $n \to \infty$?



selects a lot at random and draws a sample of 5 units from the lot. The decision rule is to select the most probable action state (there is an action state corresponding to each error state) given that the sample is found to have k defectives. What action is chosen given that the sample results in

(a) exactly 1 defective?

(b) exactly 2 defectives?

(c) 3 or more defectives?

10. The True Tone Radio Company is negotiating a contract with a seller of electronic components which supplies components in 100 unit lots. The company wishes to protect itself against the possibility that a lot contains too many defective components. The contract therefore provides that two components will be randomly selected without replacement and tested. Unfortunately, the testing procedure is not absolutely accurate. It is estimated that on the average, 90% of all good components tested are declared acceptable while only 5% of the bad components tested are accepted.

The following alternative plans are being considered as possible guidelines in making a decision whether to accept or reject a lot:

Plan 1 If both components tested are declared to be acceptable, then accept the whole lot. Otherwise reject.

Plan 2 If both components are found to be defective, then reject the lot. Otherwise accept.

Plan 3 If both components tested are satisfactory, accept the lot; if both components tested are unsatisfactory, reject the lot. If only one component tested is satisfactory, then draw a third component at random from the remaining components and accept or reject depending on whether the component is deemed satisfactory or unsatisfactory.

(a) Denote by E the event that True Tone Radio accepts the lot. Obtain P(E) for each of the three plans if there are x components in the lot that are actually defective.

(b) Compute P(E) in (a) when x = 5, 10, 20, 30, 50. Draw a graph for each plan with the value of x on the horizontal axis and P(E) on the vertical axis. A graph of this type is called an operating characteristic curve (O.C. curve) for a sampling plan.

(c) Which plan is most favorable to the buyer? to the seller?

11. Find the probability of acceptance under each plan in Problem 10 if there are x defective components and no error is made in testing the components. Draw O.C. curves for this case.

12. Relative to Example 4(b),

(a) Show that $P(R_3) = r/(r + g)$.

(b) Show by induction that $P(R_n) = r/(r + g)$ for n = 1, 2, 3, ...

13. (a) Use the result of Problem 12 to show that

$P(R_m \mid R_n) = P(R_n \mid R_m)$ for any m and n.

(b) Let A and B stand for either red or green; that is, the pair (A, B) can be any of the four combinations (R, R), (R, G), (G, R), or (G, G). Then show that $P(A_m \mid B_n) = P(A_n \mid B_m)$ for all m and n.

14. Audley and Jonckheere* devise a general urn scheme for learning experiments having two subject-controlled events. Initially, the urn has r red balls and w white balls. A ball is selected at random. If it is white, event $E_1$ occurs, the ball is replaced, and $w_1$ white balls and $r_1$ red balls are added to the contents of the urn. If the ball is red, event $E_2$ occurs, the ball is replaced, and $w_2$ white and $r_2$ red balls are added to the contents of the urn. The process is repeated indefinitely, each repetition constituting a trial. The constants $w_1$, $r_1$, $w_2$, and $r_2$ may be either positive or negative, provided that the number of balls of either color never becomes negative.

(a) Show that in general, the results of Problem 12 do not hold for Audley and Jonckheere's urn scheme.

(b) Use a tree diagram to show that if r = 2, * “ 1. r, - n i 
r. = 3, and » . - I. then «R.) = ««-’> ~ nRi) ^ 

(e) Use a tree diagram to show that.fr - hn i = 0,»> 

, a event - proba., ,,y 
a sm8k urn 

containing r balls of one color and » of another color 

18.7 independent events and independent tria 

The probability P(£ 1 F) or £ conditional on the occurrence oj^^ these 

not the same as the unconditional proo y _ 1S 0 f special 

two probabdrties are the same, that rs, w en | event F 

•merest If thrs occurs we say that event £ rs mdependent of the 

Example 1 A subject is asked on each of n trials to predict whether or not
a bulb will be illuminated. Suppose that on each trial the subject makes the
same choice, say, that the bulb will be illuminated. If the bulb is illuminated
on a random 60% of the trials, is the subject's prediction independent of the
actual state of the bulb?

Let E denote a prediction that the bulb will be illuminated and F denote
actual illumination of the bulb. Then, since $P(E) = P(E \mid F) = 1$, we
conclude that E is independent of F. In this case, one might instinctively expect
that the events should be independent since the subject's prediction was
not influenced by the actual state of the bulb.

Now suppose that the subject knows that the bulb is illuminated 60%

* Audley, R. J., and Jonckheere, A. R., "The Statistical Analysis of the Learning Process,"
British Journal of Psychology 9, 87-94 (1956).




of the time. Instead of making the same choice, he matches the proportion
of predictions that the bulb will be illuminated; that is, 60% of the time he
predicts that the bulb will be illuminated.
If we assume that the subject distributes his predictions in the same 60:40
ratio when the bulb is actually illuminated as when it is not illuminated, then

$$P(E) = 0.6 \quad \text{and} \quad P(E \mid F) = 0.6$$

and the events E and F are independent even though the subject's choice
pattern depends upon his knowledge of $P(F)$. Events E and F will always be
independent as long as the proportion of "on" predictions is the same for
those trials when the bulb is on as for those when it is off. In a probability
sense, this is all that is meant by independence of the events E and F. ►
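
The condition described in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original example: the function name is hypothetical, and the second call uses made-up conditional proportions (0.8 and 0.3) purely to show a case in which independence fails even though $P(E)$ is still 0.6.

```python
from fractions import Fraction

def prediction_is_independent(p_f, p_e_given_f, p_e_given_not_f):
    """Return (P(E), P(E and F), independent?) for a prediction event E
    and an illumination event F."""
    p_e_and_f = p_f * p_e_given_f
    # Total probability: P(E) = P(F)P(E|F) + P(F')P(E|F')
    p_e = p_e_and_f + (1 - p_f) * p_e_given_not_f
    return p_e, p_e_and_f, p_e_and_f == p_e * p_f

# Probability matching as in Example 1: "on" is predicted 60% of the time
# whether or not the bulb is actually on.
print(prediction_is_independent(Fraction(3, 5), Fraction(3, 5), Fraction(3, 5)))

# Hypothetical subject whose predictions track the bulb (assumed numbers).
print(prediction_is_independent(Fraction(3, 5), Fraction(4, 5), Fraction(3, 10)))
```

Exact fractions are used so that the equality test in the multiplication rule is not disturbed by floating-point rounding.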


When $P(F \mid E)$ and $P(E \mid F)$ are both defined, the fact that E is independent
of F implies that F is also independent of E. For in this case (from Equation
(10)), we have

$$P(F \mid E) = \frac{P(E \cap F)}{P(E)} = \frac{P(F)\,P(E \mid F)}{P(E)} = P(F)$$

since $P(E \mid F)$ is by assumption equal to $P(E)$. In either case, we find

$$P(E \cap F) = P(E)\,P(F)$$

and we use this equation in our formal definition of independence.


Definition 1 Two events E and F are said to be independent if and only if

$$P(E \cap F) = P(E)\,P(F) \qquad (17)$$

►


We shall find it convenient to refer to Equation (17) as the multiplication
rule for events E and F.

In ordinary language, quantities which are described as independent are
those which are felt to be totally unrelated and thus to have nothing in
common. This intuitive notion of independence often leads to the erroneous
conclusion that independent events are disjoint when, in fact, the opposite
is true, as is seen in the following theorem.

Theorem 12 Independent events having positive probability cannot be
disjoint.

PROOF If E and F are disjoint events, then

$$P(E \cap F) = P(\emptyset) = 0$$

On the other hand, if E and F are independent, then

$$P(E \cap F) = P(E)\,P(F)$$

This product is zero if and only if at least one of the events carries zero
probability, contrary to our assumption. ►





Example 2 If in a coin toss, the event H (heads) occurs, the event T (tails)
cannot occur. Here the occurrence of one event affects the occurrence of the
other, and we would intuitively feel that these events, though disjoint, are not
independent. The result of Theorem 12 is thus perhaps not so contrary to
intuition as one might at first glance suppose.

Careful study of the following examples and of the problems should help
to clarify the common sources of confusion concerning independence.

Example 3 Let E be any event in the sample space $\Omega$. Then $E = E \cap \Omega$
and

$$P(E \cap \Omega) = P(E) = P(E)\,P(\Omega)$$

so that the certain event $\Omega$ is independent of any event E. In particular,
it follows that $\Omega$ is independent of itself in the sense that, given that $\Omega$ has
occurred, the probability of $\Omega$ is not altered.

Example 4 (a) Let an experiment consist of tossing an honest penny twice.
Let the event E be described by "not more than one head" and the event F by
"at least one of each face." If $\Omega = \{HH, HT, TH, TT\}$, then $E = \{HT, TH, TT\}$
and $F = \{HT, TH\}$. Assigning equal probabilities gives $P(E) = \tfrac{3}{4}$,
$P(F) = \tfrac{1}{2}$, and

$$P(E \cap F) = \tfrac{1}{2} \neq \tfrac{3}{8} = P(E)\,P(F)$$

so that E and F are not independent.

(b) Let a second experiment consist of tossing the penny three times, and
let E and F be described as in (a). Again assigning equal probabilities, we
find $P(E) = \tfrac{1}{2}$, $P(F) = \tfrac{3}{4}$, and

$$P(E \cap F) = \tfrac{3}{8} = P(E)\,P(F)$$

which means that the two events are independent.

Intuitively, one might feel that the results of (a) and (b) of this example
should be the same, but such intuition is incorrect. One must always check
to see if Formula (17) holds to verify independence. Moreover, while it is
true that the descriptions of the events are the same, the elements of the
events E of parts (a) and (b) are different, since the sample spaces differ.
The same applies to the two events F, and thus it is not surprising that
conclusions differ in parts (a) and (b).
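
The two parts of Example 4 can be verified by direct enumeration of the sample space. The sketch below is illustrative only; the function name `check_example_4` is an assumption introduced here, and the case n = 4 is included simply to show that the multiplication rule fails again for a longer experiment.

```python
from fractions import Fraction
from itertools import product

def check_example_4(n):
    """Toss a fair coin n times; return (P(E), P(F), P(E and F))."""
    omega = list(product("HT", repeat=n))             # equally likely outcomes
    E = {w for w in omega if w.count("H") <= 1}       # not more than one head
    F = {w for w in omega if "H" in w and "T" in w}   # at least one of each face

    def prob(A):
        return Fraction(len(A), len(omega))

    return prob(E), prob(F), prob(E & F)

for n in (2, 3, 4):
    p_e, p_f, p_ef = check_example_4(n)
    # The last column reports whether Formula (17) holds.
    print(n, p_e, p_f, p_ef, p_ef == p_e * p_f)
```

Only the row for n = 3 reports that the multiplication rule holds.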

If the probability of E is unchanged by knowledge of the occurrence of F,
it seems reasonable that this probability should also be unchanged by the
knowledge that F fails to occur.

Theorem 13 Two events E and F are independent if and only if the pairs
(E, F'), (E', F), and (E', F') are independent pairs.





PROOF We shall prove the first of these statements, leaving the others as
exercises. We have $E = (E \cap F) \cup (E \cap F')$, so that $P(E) = P(E \cap F) + P(E \cap F')$.
Using the assumed independence of E and F gives

$$P(E \cap F') = P(E) - P(E \cap F) = P(E) - P(E)P(F) = P(E)[1 - P(F)] = P(E)P(F')$$ ►

In extending the concept of independence to more than two events $E_1$,
$E_2, \ldots, E_n$, it is natural to require that Formula (17) hold for each pair
$E_i$, $E_j$ ($i \neq j$). In this case we say that the events are pairwise independent.
At the same time, one might expect that independence of events $E_1$, $E_2$, and
$E_3$ would, for instance, imply independence of combinations such as $E_1 \cap E_2$
and $E_3$. However, the next example shows that pairwise independence is not
sufficient for this purpose.

Example 5 Two people are contacted in a public opinion poll and queried
as to whether they have a favorable F or unfavorable U reaction to the
labeling of cigarettes as health hazards. An appropriate sample space is
$\Omega = \{FF, FU, UF, UU\}$. Let us assign probability $\tfrac{1}{4}$ to each single event in
$\Omega$ and consider events

$E_1$ = "the first person was favorable"

$E_2$ = "the second person was favorable"

$E_3$ = "both persons gave the same reaction"

Then

$$P(E_1) = P(E_2) = P(E_3) = \tfrac{1}{2}$$

$$P(E_1 \cap E_2) = P(E_1 \cap E_3) = P(E_2 \cap E_3) = \tfrac{1}{4}$$

$$P[(E_1 \cap E_2) \cap E_3] = \tfrac{1}{4} \neq \tfrac{1}{8} = P(E_1 \cap E_2)P(E_3) = P(E_1)P(E_2)P(E_3)$$

The events are pairwise independent, but $E_3$ is not independent of $E_1 \cap E_2$.

►
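
The calculation in Example 5 can be reproduced by enumerating the four equally likely outcomes. This is a minimal sketch; the helper names `prob` and `multiplication_rule_holds` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

omega = ["FF", "FU", "UF", "UU"]             # each outcome has probability 1/4
E1 = {w for w in omega if w[0] == "F"}       # first person favorable
E2 = {w for w in omega if w[1] == "F"}       # second person favorable
E3 = {w for w in omega if w[0] == w[1]}      # both gave the same reaction

def prob(A):
    return Fraction(len(A), len(omega))

def multiplication_rule_holds(events):
    """True if P(intersection) equals the product of the individual probabilities."""
    intersection = set(omega).intersection(*events)
    expected = Fraction(1)
    for A in events:
        expected *= prob(A)
    return prob(intersection) == expected

pairwise = all(multiplication_rule_holds(pair)
               for pair in combinations([E1, E2, E3], 2))
triple = multiplication_rule_holds([E1, E2, E3])
print(pairwise, triple)   # True False
```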


It is clear that in addition to the pairwise independence of $E_1$, $E_2$, and $E_3$,
we must also require

$$P(E_1 \cap E_2 \cap E_3) = P(E_1)P(E_2)P(E_3) \qquad (18)$$

to define the independence of three events. Equation (18) is referred to as the
multiplication rule for events $E_1$, $E_2$, and $E_3$.

Definition 2 Three events $E_1$, $E_2$, and $E_3$ are said to be independent if
and only if (18) holds and (17) holds for each pair, that is, if and only if the
multiplication rule holds for each combination of two or more of the events.

►





Theorem 14 If $E_1$, $E_2$, and $E_3$ are independent events, then so are

(a) $E_1 \cap E_3$ and $E_2'$      (b) $E_2 \cup E_3$ and $E_1$

(c) $E_2'$ and $E_3 \cup E_1'$      (d) $E_1'$, $E_2'$, and $E_3'$

PROOF We prove (a) and leave the others as exercises. It is easily verified
that $E_1 \cap E_3$ may be partitioned as

$$E_1 \cap E_3 = (E_1 \cap E_3 \cap E_2) \cup (E_1 \cap E_3 \cap E_2')$$

Thus

$$\begin{aligned}
P(E_1 \cap E_3 \cap E_2') &= P(E_1 \cap E_3) - P(E_1 \cap E_3 \cap E_2) \\
&= P(E_1)P(E_3) - P(E_1)P(E_2)P(E_3) \\
&= P(E_1)P(E_3)[1 - P(E_2)] \\
&= P(E_1 \cap E_3)P(E_2')
\end{aligned}$$ ►

More generally, it can be proved that any event expressible solely in terms
of $E_1$ and $E_2$ is independent of $E_3$, any event expressible in terms of $E_2$ and
$E_3$ is independent of $E_1$, and so forth. (We do not prove this statement here,
but refer the interested reader to Pfeiffer's book, listed under Supplementary
Reading.)

Example 6 An established decision-making scheme requires that an
individual be accepted or rejected on the basis of scores on three tests.
The decision strategy is to reject every individual who falls below a minimum
cutoff score on each of the three tests.

Let $E_1$, $E_2$, and $E_3$ denote the events that the subject exceeds the cutoff
score on the first, second, and third test, respectively. Suppose that the
cutoff points were established so that

$$P(E_1) = 0.6 \qquad P(E_2) = 0.3 \qquad P(E_3) = 0.1$$

Assuming that $E_1$, $E_2$, and $E_3$ are independent events, what is the probability
that an individual selected at random will be rejected?

Since rejection implies that an individual scores below the cutoff score on
all three tests, the required probability is given by

$$P(E_1' \cap E_2' \cap E_3')$$

By the independence assumption, the probability of rejection is

$$P(E_1' \cap E_2' \cap E_3') = 0.4 \times 0.7 \times 0.9 = 0.252$$
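
The arithmetic of Example 6 can be written as a short function. This is a sketch under the example's independence assumption; the function name `p_reject_all_below` is hypothetical.

```python
def p_reject_all_below(p_exceed):
    """P(E_1' and E_2' and ...) for independent events:
    the product of the complementary probabilities 1 - P(E_i)."""
    result = 1.0
    for p in p_exceed:
        result *= (1 - p)
    return result

# P(E_1), P(E_2), P(E_3) from Example 6
print(round(p_reject_all_below([0.6, 0.3, 0.1]), 3))   # 0.252
```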

In extending the definition of independence to more than three events,
we shall follow the pattern of the definition for independence of three
events.




Definition 3 Events $E_1, E_2, \ldots, E_n$ are said to be independent if and only
if the multiplication rule holds for each combination of two or more of the
events.

Thus a multiplication rule is required for every subcollection of events
$E_{i_1}, \ldots, E_{i_k}$ except those containing none or one of the events. Since
there are $2^n$ possible subcollections, of which one is empty and n contain a
single event, there are then $2^n - n - 1$ equations which must be satisfied in
order that the events be independent.
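
The count $2^n - n - 1$ can be confirmed by listing the subcollections directly. The sketch below is illustrative; the function name is an assumption, and events are represented only by their indices 1, ..., n.

```python
from itertools import combinations

def multiplication_rule_subsets(n):
    """All index sets of size two or more drawn from events 1..n.
    Each one corresponds to one multiplication rule that must hold."""
    indices = range(1, n + 1)
    return [c for k in range(2, n + 1) for c in combinations(indices, k)]

for n in (2, 3, 4):
    subsets = multiplication_rule_subsets(n)
    print(n, len(subsets), 2**n - n - 1)   # the two counts agree

print(multiplication_rule_subsets(3))
# [(1, 2), (1, 3), (2, 3), (1, 2, 3)]
```

For n = 3 the list contains the three pairwise rules of Formula (17) and the single triple rule of Equation (18).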

Again it is possible to prove that any two set-theoretic combinations of
independent events must themselves be independent if they involve no event
in common. Thus, for example, if $E_1, E_2, \ldots, E_6$ are independent, then so
are $(E_1 \cap E_4)' \cup E_5$ and $E_2' \cap E_6$. (Details explaining these results may be
found in Pfeiffer, Chapter 2.)

Situations involving several independent events most often arise in
connection with experiments composed of sequences of trials.

Example 7 Let us suppose that in the public opinion poll of Example 5,
60% of the population view cigarettes as a health hazard while 40% do not.
The polling of one person constitutes a trial with possible outcomes F and
U. Denote by $F_n$ and $U_n$ the obtaining of a favorable or an unfavorable
reaction, respectively, on trial n, that is, the nth person polled.

If the trials are such that on trial n, $P(F_n) = 0.6$ and $P(U_n) = 0.4$, regardless
of the outcomes of other trials, then we have a sequence of independent
trials. The probability that, for example, the first 10 persons respond F and
the next three U is then $(0.6)^{10}(0.4)^3$. Indeed, every particular sequence of 13
responses, 10 of which are F and 3 of which are U, would have the same
probability. Since there are

$$\binom{13}{10}$$

different arrangements of 10 symbols F and 3 symbols U, the probability
that of the first 13 responses, exactly 10 are F is

$$\binom{13}{10}(0.6)^{10}(0.4)^3$$ ►
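
The probability in Example 7 is easy to evaluate numerically. The following is a small sketch; the function name `prob_exactly` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def prob_exactly(k, n, p):
    """P(exactly k of n independent trials give outcome F), with P(F) = p on each trial."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(comb(13, 10))                          # 286 arrangements of 10 F's and 3 U's
print(round(prob_exactly(10, 13, 0.6), 4))   # 0.1107
```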


In applications of the theory of probability, independence usually arises,
as in Example 7, as an assumption on which the assignment of probabilities
is based. Whether this is a proper criterion for making the assignment is, of
course, not a part of the theory, but must be decided on the basis of the
experience and judgment of the experimenter. Certainly, most of us would
consider repeated tossing of a single coin as involving independent trials,
while repeated presentation of the same learning situation to the same
subject would not be so considered. In either case, the probabilities assigned
are at the discretion of the user of probability. The theory can only be applied
to deduce new probabilities from assumed ones.




On the other hand, if, having made certain assumptions, results are
obtained which are at variance with one's intuition, then either assumptions or
intuition must be changed. Example 4 is a case in which intuition is modified,
while the next example illustrates the opposite situation.


Example 8 Numerous investigations have found that human subjects are
unable to generate a random sequence of responses when so instructed.
In one such study conducted by Brown,* 48 college students were each
instructed to generate a sequence of heads X and tails O which would look
like the results they would expect if a balanced coin were tossed 17 times.
If the trials are independent, we would expect to find that the proportion
of O on trials following an O should equal 0.5, as should the proportion of
O on trials following a run of two O. In fact, the proportion of O following
any pattern whatsoever should equal 0.5. Table 3 shows the results obtained.


TABLE 3

Immediately preceding   Proportion of     Immediately preceding   Proportion of
response pattern        O following       response pattern        O following

O                       0.43              OOO                     0.30
X                       0.59              OOX                     0.53
OO                      0.38              OXO                     0.42
OX                      0.55              OXX                     0.65
XO                      0.48              XOO                     0.41
XX                      0.66              XOX                     0.56
                                          XXO                     0.53
                                          XXX                     0.67


It is obvious that the results are not in close agreement with those expected
if trials are assumed to be independent. Subjects apparently tended to avoid
subsequences of responses that looked patterned. For example, subjects
were more likely to generate the subsequence ... The avoidance
of perceived patterning is particularly apparent in the three-response
patterns, as evidenced by the preference for subsequences OOOX vs OOOO,
XXXO vs XXXX, and OXXO vs OXXX.

On the basis of the empirical evidence, the assumption of independence
does not appear warranted. More sophisticated methods for evaluating whether
or not response runs of X or O are excessive relative to those expected from
genuinely independent trials are beyond the scope of this book.†
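
Proportions of the kind reported in Table 3 can be computed from any response sequence as sketched below. The function name and the short test sequence are hypothetical illustrations; the sequence is not data from Brown's study.

```python
def proportion_following(sequence, pattern, symbol="O"):
    """Fraction of occurrences of `pattern` that are immediately followed by `symbol`.

    Returns None if `pattern` is never followed by another response.
    """
    k = len(pattern)
    followers = [sequence[i + k]
                 for i in range(len(sequence) - k)
                 if sequence[i:i + k] == pattern]
    if not followers:
        return None
    return followers.count(symbol) / len(followers)

seq = "OOXOXX"   # made-up responses, for illustration only
print(proportion_following(seq, "OX"))   # 0.5
print(proportion_following(seq, "XX"))   # None: XX occurs only at the very end
```

Under independence with a balanced coin, these proportions should be close to 0.5 for every preceding pattern when the sequence is long.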

* Brown, D. L., "Non-independence in Subjectively Random ...," Educational
Testing Service Research Bulletin No. ... (..., New Jersey, ...).

† For an introduction to the theory of runs, see Feller, W., An Introduction to
Probability Theory and Its Applications (John Wiley & Sons, Inc., New York, 1968),
Vol. I, 3rd ed., Chapter ...




Example 9 The sales manager for the MT Brush Company estimates the
probability that a house call results in a sale to be $\tfrac{1}{5}$. If house calls are
assumed to be independent trials, what is the probability that a new salesman
makes his first sale on the kth house call?

Let $S_i$ denote the event "ith house call results in a sale" and $F_i$ the event
"ith house call does not result in a sale." In order for a salesman to make
his first sale on the kth trial, he must of necessity have been unsuccessful in
making a sale on the preceding k - 1 trials and successful in making a sale
on the kth trial. Hence, the event $E_k$, "first sale on the kth house call," is

$$E_k = F_1 \cap F_2 \cap \cdots \cap F_{k-1} \cap S_k$$

Since the trials are assumed to be independent with constant probability
of $S_i$,

$$P(E_k) = P(F_1)P(F_2) \cdots P(F_{k-1})P(S_k) = \tfrac{4}{5} \cdot \tfrac{4}{5} \cdots \tfrac{4}{5} \cdot \tfrac{1}{5} = \tfrac{1}{5}\left(\tfrac{4}{5}\right)^{k-1}$$

Since in this example k may be any positive integer, we see that
independence can be applied to an infinite as well as finite sequence of trials. ►
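
The probabilities $P(E_k)$ of Example 9 can be tabulated exactly. This is a minimal sketch; the function name `first_sale_probability` is an assumption, and the default sale probability 1/5 is the value used in the example.

```python
from fractions import Fraction

def first_sale_probability(k, p=Fraction(1, 5)):
    """P(E_k): no sale on calls 1..k-1, then a sale on call k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1 - p) ** (k - 1) * p

print([str(first_sale_probability(k)) for k in (1, 2, 3)])
# ['1/5', '4/25', '16/125']

# The first ten probabilities sum to 1 - (4/5)**10, which approaches 1 as more terms are added.
partial = sum(first_sale_probability(k) for k in range(1, 11))
print(partial == 1 - Fraction(4, 5) ** 10)   # True
```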

PROBLEMS

1. Lorge and Solomon* have developed a model of group problem-solving
behavior. If it is assumed that k group members work independently and
that each member has probability $p_I$ of solving the problem, show that the
probability $p_G$ that the group of k members solves the problem is

$$p_G = 1 - (1 - p_I)^k$$

Comment on the effect of increasing group size.

2. Suppose that it is further assumed that there are s independent stages in
the solution of the problem in Problem 1 and that

$$p_I = p_1 p_2 \cdots p_s$$

where $p_i$ is the individual's probability of solution at stage i. Assuming
that $p_1 = p_2 = \cdots = p_s$ and that the group solves the problem only
if, at each stage, at least one member solves that stage, show that

$$p_G = [1 - (1 - p_I^{1/s})^k]^s$$

Comment on the effect of increasing the number of problem stages if
we assume that $p_i = p_I^{1/s}$ is constant.

3. The Seymore Company makes rectangular glass portholes according to
contract specification for installation in space vehicles. Because of the

* Lorge, I., and Solomon, H., "Two Models of Group Behavior in the Solution of Eureka-
Type Problems," Psychometrika 20, 139-148 (1955).





nature of the mounting, tolerances for length and width are especially
critical. It is known that 2% of the units exceed tolerance specifications
for length, 3% exceed tolerance specifications for width, and 1% exceed
tolerance specifications for thickness. A porthole is acceptable for delivery
only if its length and width are within the specified tolerances; that is, a
unit may exceed the tolerances on thickness and still be acceptable
provided that the length and width are within the specified tolerances.
If dimensional variability is assumed to be independent, what is the
probability that at least one of six units selected at random is acceptable
for delivery?

4. A test has five multiple-choice items, each with four choices, only one of
which is correct. Suppose a student guesses the answer to each question.
Assuming that his guesses are independent, define an appropriate sample
space and assign probabilities to its simple events. Find the probability
that the student correctly guesses

(a) none of the items      (b) all of the items

(c) at least one item      (d) at most three items

5. (a) An experimenter needs at least one subject for an experiment.
Suppose that he has a list of persons who have indicated an interest in
participating in his experiment. If each person contacted has probability
... of showing, and if the persons contacted form independent trials, how
many persons should be contacted to insure that the probability of at
least one showing is at least 0.99?

(b) What is the probability that the fourth subject contacted by the
experimenter is the first to show for the experiment?

6. The ABC Brush Company has five salesmen ... territories. If the
probability is 0.2 that a salesman makes a sale on any given day, what is
the probability that ... of the five salesmen will make a sale on a given day?

7. Suppose that you are constructing a ... A and B described below.
Indicate your opinion as to whether an assumption of independence would
be appropriate and state ...

(a) A is the event that a student is in the upper ... of the class on the
first class quiz and B is the event that the student is in the lower
25% on the second quiz.

(b) A is the event that the national economy is in an inflationary trend
and B is the event that interest rates are ...

(c) A is the event that subject A learns a list in 10 trials and B is the event
that subject B learns the list in 5 trials.

(d) A is the event that a presidential candidate of a ... political party is
elected and B is the event that a city mayor of the same party is
elected.

(e) A is the event that a student has high ... school and B the
event that the same individual becomes ... of the ABC
Company.





8. Each of two groups of people contains n members. Suppose that in each
group, the members have been ranked in descending order according to
their scores on a psychological test. If one person is selected at random
from each group and if we assume that the selections are independent,
what is the probability that neither of the persons drawn falls below the
mth rank order?

9. Continuing Example 6, suppose that the decision strategy is to reject
those individuals who fall below a minimum cutoff score on a majority
of the three tests. What is the probability of rejection under this scheme?

10. A group of 1000 voters are interviewed in May and again in August
concerning their party choice in the forthcoming election. Let R, D, and
U denote, respectively, the events that the respondent intends to vote
Republican, Democratic, or is undecided. Probabilities of pairs of
events are given in the following table.





                        August
               R        D        U
        R    0.275    0.220    0.055    0.550
May     D    0.200    0.170    0.030    0.400
        U    0.025    0.010    0.015    0.050
             0.500    0.400    0.100



For example, 0.275 is the probability that a respondent intends to vote
Republican in May and has the same intention in August, whereas 0.550
is the probability that a respondent interviewed in May intends to vote
Republican and 0.500 is the probability that a respondent interviewed
in August intends to vote Republican. Given the above information, is a
respondent's voting intention in May independent of his voting intention
in August?

11. In Example 9, show that if events $E_1, E_2, E_3, \ldots$ are assigned
probabilities $P(E_k) = p(1 - p)^{k-1}$ where $0 < p < 1$, then $\sum_{k=1}^{\infty} P(E_k) = 1$,
that is, this is a legitimate probability assignment.

12. If E and F are independent events and G is a subset of E, is it necessarily
the case that G and F are independent?

13. Is independence a transitive relation? That is, if E and F are independent
and F and G are independent, does it follow that E and G are independent?

14. Show that $\Omega$ and $\emptyset$ are each independent of every event. Further, show
that the only events which are independent of themselves are events
having probability zero or probability one.

15. Continuing Example 4, suppose the penny is tossed n times. Show that
events E and F are independent only if n = 3.

16. Complete the proof of Theorem 13.

17. Complete the proof of Theorem 14.





SUPPLEMENTARY READING

Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley
& Sons, Inc., New York, 1968), Vol. I, 3rd ed., Chapters 1 and 5.

Goldberg, S., Probability, An Introduction (Prentice-Hall, Inc., Englewood Cliffs,
N.J., 1960), Chapter 2. (An excellent elementary text on probability.)

Parzen, E., Modern Probability Theory and Its Applications (John Wiley & Sons,
Inc., New York, 1960), Chapters 1 and 2.

Pfeiffer, P. E., Concepts of Probability Theory (McGraw-Hill Book Company,
New York, 1965), Chapters 1 and 2.





DISCRETE RANDOM VARIABLES 19 


19.1 RANDOM VARIABLES AND PROBABILITY FUNCTIONS

Some experimental situations directly produce numbers as outcomes, while
others produce results in the form of numerical values, such as ranks or
ratings, that are later assigned to the actual outcomes to aid in the description
and analysis of experimental results. An assignment of numbers to outcomes
of an experiment or, more precisely, to elements of the sample space associated
with the experiment, produces a function whose domain is the sample space
and whose range is some set of real numbers. Such a function is called a
random variable.

Example 1 Suppose we are to toss a coin three times and that we will
receive $1 for each head that appears. The set

$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$$

is the natural sample space for this experiment.
However, we are not interested in the exact sequence of heads and tails
but rather in the number of heads each sequence produces. With any
particular outcome of the experiment we are interested in associating the
number of heads occurring in that outcome. This association determines a
random variable N whose domain is $\Omega$ and whose range is the set {0, 1, 2, 3}.
The value which N assigns to a particular element of $\Omega$ is the number of H
in that element. Thus,

N(HHH) = 3, N(HHT) = 2, N(HTH) = 2, N(HTT) = 1, N(THH) = 2,

N(THT) = 1, N(TTH) = 1, N(TTT) = 0 ►
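
Because a random variable is simply a function on the sample space, Example 1 translates directly into code. The sketch below is illustrative; the helper name `event` is an assumption introduced to collect the outcomes whose assigned value satisfies a condition.

```python
from itertools import product

omega = ["".join(w) for w in product("HT", repeat=3)]   # HHH, HHT, ..., TTT

def N(outcome):
    """The random variable N: number of heads in the outcome."""
    return outcome.count("H")

def event(X, predicate):
    """Set of outcomes in omega whose value under X satisfies the predicate."""
    return {w for w in omega if predicate(X(w))}

print({w: N(w) for w in omega})
print(sorted(event(N, lambda x: x == 2)))   # ['HHT', 'HTH', 'THH']
print(event(N, lambda x: x < 0))            # set(): no outcome has a negative count
```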




The term "random variable" is somewhat unfortunate in that such an
object is not a variable but a function, nor is it in any sense random.
However, this terminology is standard. As long as the concept of a random
variable as a function is kept firmly in mind, no confusion should result.

Random variables provide a useful shorthand notation for designating
certain events. For instance, the set of elements in the sample space of
Example 1 to which the random variable N assigns the value 2 is denoted by
(N = 2). That is,

(N = 2) = {HHT, HTH, THH}

In general, if X is a random variable whose domain is $\Omega$,
the notation (X = x) is used to denote the set of elements of $\Omega$ to which X
assigns the value x. We follow the common practice of using capital letters
for random variables and lower case letters for real numbers. That is,

$$(X = x) = \{\omega : X(\omega) = x\}$$

Similarly, (X < x) is the shorthand notation for the set
of those elements in $\Omega$ which are assigned values less than x, and so
forth.

Example 2 In Example 1, (N = 1) = {HTT, THT, TTH}, (N = 3) = {HHH},
(N ≤ 1) = {TTT, HTT, THT, TTH}, and (N > 2) = {HHH},
while (N < 0), (N = 7), and (N = -2) all denote the
empty set. ►

PROBLEMS

1. Referring to Example 1, write the events symbolized by (N = 0),
(N > 2), (N < 0), (1 < N < 2), and (N < 3).

2. Suppose that in Example 1 the probabilities ... are assigned to the
respective simple events. For each real number x, compute P(N = x).

3. In Problem 2, compute P(N ≤ x) for each real number x.

Corresponding to each random variable X is a function $p_X$ defined by

$$p_X(x) = P(X = x)$$

which assigns to each real number x the probability of the event (X = x);
this function is called the probability function of X. The function $F_X$ defined
by

$$F_X(x) = P(X \le x)$$

is called the distribution function of X.




PROBLEMS

4. For the random variable N in Example 1, draw graphs of its probability
function (Problem 2) and its distribution function (Problem 3).


Example 3 In Example 9, Section 18.7, define the random variable W to
be the number of the house call on which the first sale is made. Then the
event (W = k) is the same as the event $E_k$ described in that example. The
probability function $p_W$ of W is given by

$$p_W(k) = P(W = k) = P(E_k) = \tfrac{1}{5}\left(\tfrac{4}{5}\right)^{k-1} \quad \text{if } k \text{ is a positive integer}$$

$$p_W(k) = 0 \quad \text{if } k \text{ is not a positive integer}$$

A graph of the function $p_W$ is shown in Figure 1. The functional values are
zero everywhere except at the isolated positive integer points.

FIGURE 1 (graph of $p_W$)

The distribution function of W is given by

$$F_W(w) = P(W \le w) = 0 \quad \text{for } w < 1$$

$$F_W(w) = P(W \le w) = \sum_{k=1}^{[w]} P(W = k) = 1 - \left(\tfrac{4}{5}\right)^{[w]} \quad \text{for } w \ge 1$$

where [w] denotes the largest integer less than or equal to w. For example,

$$P(W \le 2) = P(W \le 2.4) = P(W = 1) + P(W = 2) = 1 - \left(\tfrac{4}{5}\right)^2$$

$$P(W \le 3.29) = P(W \le 3) = P(W = 1) + P(W = 2) + P(W = 3) = 1 - \left(\tfrac{4}{5}\right)^3$$

Figure 2 shows the graph of the distribution function $F_W$. Note particularly
that for points between two integers the values of $F_W$ remain the same.
The only change in $F_W$ occurs in jumps taken at the integer points 1, 2, 3, ...
For each k, the magnitude of the jump at the point k equals P(W = k). ►
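
The step behavior of the distribution function in Example 3 can be seen by evaluating it at a few points. This is a sketch only; the names `p_W`, `F_W`, and `P_SALE` are chosen here to mirror the notation of the example and are otherwise assumptions.

```python
import math
from fractions import Fraction

P_SALE = Fraction(1, 5)   # sale probability from Example 9

def p_W(k):
    """Probability function of W: nonzero only at positive integers."""
    if k >= 1 and k == math.floor(k):
        return (1 - P_SALE) ** (math.floor(k) - 1) * P_SALE
    return Fraction(0)

def F_W(w):
    """Distribution function P(W <= w) = 1 - (4/5)**[w] for w >= 1, else 0."""
    if w < 1:
        return Fraction(0)
    return 1 - (1 - P_SALE) ** math.floor(w)

print(F_W(2.4) == p_W(1) + p_W(2))            # True
print(F_W(3.29) == 1 - Fraction(4, 5) ** 3)   # True
print(F_W(3) - F_W(2.999) == p_W(3))          # True: the jump at 3 equals P(W = 3)
```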



Example 4 An outcome of the experiment of flipping the spinner shown in
Figure 3 is some number between 0 and 1. Let us take $\Omega = \{x : 0 \le x \le 1\}$
as the sample space and define the random variable X, which indicates the
point at which the spinner stops, by

$$X(x) = x$$

for each $x \in \Omega$. If we assume that the probability that the spinner stops in
any given interval is proportional to the length of that interval, then the
distribution function of X is given by

$$F_X(x) = P(X \le x) = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } 0 \le x \le 1 \\ 1 & \text{if } x > 1 \end{cases}$$

FIGURE 3 (spinner)

The graph of $F_X$ is shown in Figure 4.




probability function : 


P(A = 0) = 0 1 P(A = 1) = 0 \c 

P(A = 2) = 0 1c P(A = 3) = 2 3c 

P(A = 4 ) = c P(A = 5) = 0 4c 

and P(/4 = k) = 0 for all other values of k 

(a) Determine the value of the constant c 

(b) Find P(A > 3), P(1 < A < 4), P(2 < A < 5), and 

P[(A < 1) U (A > 4)] 

(c) Graph the distribution function of A 

7. The amount of rainfall in Seattle during the month of November is a random variable R having the distribution function


Fn(x) = P(R < v) = 


|0 if ’ 

u - \ e ~ x lf ; 


< o 
> o 


(a) Draw a graph of Fr 

(b) Find P(R < 0), P(R < 7), P(2 < R < 5), and P(R > 2) 

8 - In the context of Problems 9 and 10 of Section 18 3, let the random 
variable L denote the length of the cycle to which the first member of a 
five-member group belongs (L = 0 denotes absence of membership in 
any cycle ) 

(a) Find the probability function of L 

(b) Graph its distribution function 

(c) What is the probability that this individual is a member of a cycle of length at least 3? exactly 3? at most 3?

(d) Find the smallest number / such that the probability or (L > I ) is at 
least 

(e) What is the conditional probability that this individual is a member of a 4-cycle, given that he is a member of some cycle; that is, what is P(L = 4 | L > 0)?

9 - Assume there is probability J lhal an applicant qualities for a 
"nth smart Consulting, Inc Three applicants are independe * 

""■wed Let Q denote the number of applicants who qualify. Wlhe number 
" h <> do not, and define Z - Kit? - M ~ » Di;finc ** r ‘ md ° m 
"enable At by 

$X_1 = 1$ if the first applicant qualifies

$X_1 = 0$ if the first applicant fails to qualify

(a) Argue that $X_1$ and $Z$ are different random variables (that is, different functions defined on the sample space), but that their probability functions are identical.

(b) Argue that there are infinitely many random variables whose probability functions are identical with that of $X_1$.




FIGURE 4 





Knowing $F_X$, we may compute other probabilities. For example,

$$P(0.3 < X \le 0.45) = P(X \le 0.45) - P(X \le 0.3) = F_X(0.45) - F_X(0.3) = 0.45 - 0.3 = 0.15$$

and

$$P(0.7 < X \le 2.6) = F_X(2.6) - F_X(0.7) = 1 - 0.7 = 0.3$$

The distribution function gives us the information we need about the random variable $X$. On the other hand, since the length of an interval containing a single point is zero, the probability function of $X$ is given by

$$p_X(x) = P(X = x) = 0 \quad \text{for all real numbers } x$$

In this case, then, the probability function gives relatively little information about $X$.
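The interval computations for the spinner can be checked with a short sketch. The function names `spinner_cdf` and `interval_probability` are illustrative choices, not part of the text; the code only assumes the distribution function of Example 4 (0 below 0, $x$ on the unit interval, 1 above 1).

```python
def spinner_cdf(x: float) -> float:
    """Distribution function F_X of the spinner: P(X <= x)."""
    if x < 0:
        return 0.0
    if x <= 1:
        return x
    return 1.0


def interval_probability(a: float, b: float) -> float:
    """P(a < X <= b), computed as F_X(b) - F_X(a)."""
    return spinner_cdf(b) - spinner_cdf(a)


if __name__ == "__main__":
    print(interval_probability(0.3, 0.45))  # the first computation in the text
    print(interval_probability(0.7, 2.6))   # the second computation in the text
```

Because every probability is obtained as a difference of two values of $F_X$, the same helper works for any random variable once its distribution function is supplied.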

PROBLEMS 

5 Pc/’T COmPU " PiX> °V’ n-0Z<X<0m) and 


The random variable $W$ in Example 3, whose distribution function consists only of jumps (see Figure 2), is called discrete. In considering discrete random variables, it is convenient to use the probability function, listing all possible values along with the probabilities of taking these values. Random variables such as $X$ in Example 4, whose distribution functions take no jumps, and hence are continuous, are called continuous. For these random variables it is more convenient to use the distribution function. We shall discuss discrete random variables in this chapter, leaving the continuous case to Chapter 21.

PROBLEMS 

6 Nor, ^Carolina “““T ^ ° n a da k m Chapel Hill, 
Norih Carolina, is a d,screle random vanable A having the following 


Of course, if $k$ is not one of the integers $0, 1, 2, \ldots, r$, then $P(N_r = k) = 0$.

Because of the coefficient

$$\binom{r}{k}$$

appearing in (1), $N_r$ is often called a binomial random variable.

Example 1 Deutsch and Madow* are concerned about the problem of distinguishing between "genuine wisdom" in bureaucratic decision makers and "pseudowisdom," which results from the accidental making of correct decisions. In their discussion, they assume that every decision of each decision maker is made independently of all others, that the organization can distinguish "correct" from "incorrect" decisions, and that there is a uniform probability $p$ of being correct which holds for all decisions and decision makers. In short, individual decisions constitute Bernoulli trials.

The probability that an individual is correct in $k$ of $n$ decisions is

$$\binom{n}{k} p^k (1-p)^{n-k}$$

For $k = n$, this becomes

$$\binom{n}{n} p^n (1-p)^0 = p^n$$

Hence, the probability that, of $m$ individuals each making $n$ decisions, at least one makes all $n$ decisions correctly is

$$1 - (1 - p^n)^m$$

For a specified value $u$, the smallest number of individuals $m$ such that the probability equals $u$ that at least one person makes all $n$ decisions correctly is approximately

$$m = \frac{\log(1-u)}{\log(1-p^n)}$$

If $p = 0.5$, $u = 0.8$, and $n = 8$, this number is about 411. That is, the probability is about 0.8 that an organization of this size may produce at least one person with a perfect record on eight decisions, even though the individual decision makers are no more competent than flipped coins.
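The organization size in Example 1 can be recomputed directly from the two formulas above. This is a minimal sketch; the function names are hypothetical, and the only inputs are the values $p = 0.5$, $u = 0.8$, $n = 8$ quoted in the example.

```python
import math


def prob_at_least_one_perfect(p: float, n: int, m: int) -> float:
    """Probability that at least one of m individuals gets all n decisions right."""
    return 1 - (1 - p**n) ** m


def individuals_needed(p: float, n: int, u: float) -> float:
    """Approximate m solving 1 - (1 - p^n)^m = u, i.e. log(1-u)/log(1-p^n)."""
    return math.log(1 - u) / math.log(1 - p**n)


if __name__ == "__main__":
    m = individuals_needed(0.5, 8, 0.8)
    print(m)                                               # real-valued solution
    print(prob_at_least_one_perfect(0.5, 8, math.ceil(m)))  # check at the next integer
```

Rounding the real-valued solution up to the next integer gives the smallest whole number of individuals for which the probability is at least $u$.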

PROBLEMS

2. Returning to Problem 1, let N denote the number of heads obtained. Find the probability function of N and show that it has the form of Equation (1) with r = 3.


*Deutsch, K. W., and Madow, W. G., "A Note on the Appearance of Wisdom in Large Bureaucratic Organizations," Behavioral Science 6, 72-78 (1961).



19.2 RANDOM VARIABLES ASSOCIATED WITH BERNOULLI TRIALS

The most important discrete random variables arise in connection with Bernoulli* trials. A sequence of trials is a Bernoulli sequence if

(i) The trials are independent.

(ii) Each trial results in one of two possible outcomes, which are arbitrarily called success S and failure F.

(iii) The probability $p = P(S)$ of success is the same on each trial.

Repeated tossing of a coin, the public opinion poll of Example 7, Section 18.7, and drawing with replacement from an urn containing balls of two colors are typical examples of Bernoulli trials. The designation of success and failure is quite arbitrary and is used only to present a standardized terminology. In tossing a coin, heads represents success for one player but failure for the other. Each uses the model according to his own tastes.

Since there are two possible outcomes on each trial, a sample space for $r$ trials contains $2 \times 2 \times \cdots \times 2 = 2^r$ elements, each denoting one of the possible sequences of S and F which might occur. The assumption of independence implies that the probability which is assigned to any sequence having exactly $k$ successes S and $r - k$ failures F is $p^k(1-p)^{r-k}$. Once $p$ is known, the entire probability assignment is determined.
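The probability assignment just described can be made concrete by listing the sample space for a small number of trials. The sketch below is illustrative only: the helper names and the choice $r = 3$, $p = 0.25$ are assumptions made for the demonstration.

```python
from itertools import product


def sample_space(r: int) -> list[str]:
    """All 2^r sequences of S (success) and F (failure) for r trials."""
    return ["".join(seq) for seq in product("SF", repeat=r)]


def sequence_probability(seq: str, p: float) -> float:
    """p^k (1-p)^(r-k) for a sequence with k successes among r trials."""
    k = seq.count("S")
    return p**k * (1 - p) ** (len(seq) - k)


if __name__ == "__main__":
    r, p = 3, 0.25  # illustrative values, not taken from the text
    for seq in sample_space(r):
        print(seq, sequence_probability(seq, p))
    print(sum(sequence_probability(seq, p) for seq in sample_space(r)))
```

Summing over all $2^r$ sequences returns 1 (up to rounding), which confirms that the single number $p$ determines a complete probability assignment.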

PROBLEMS 

1 Suppose the model of Bernoulli trials had been applicable in Example I 
of Section 19 1 and that p - P(H ) = j Determine the appropriate 
assignment of probabilities to the simple events 

Let us denote by $N_r$ the random variable whose values represent the number of successes in $r$ Bernoulli trials. The possible values of $N_r$ are $0, 1, 2, \ldots, r$. The event $(N_r = k)$ contains all those elements of the sample space which are arrangements of exactly $k$ successes S and $r - k$ failures F. Since any such arrangement is determined by the choice of $k$ positions in which to place S, the number of arrangements is

$$\binom{r}{k}$$

the number of ways of choosing $k$ objects from $r$ objects. It follows that

$$P(N_r = k) = \binom{r}{k} p^k (1-p)^{r-k} \quad \text{if } k = 0, 1, 2, \ldots, r \tag{1}$$

*After James Bernoulli (1654-1705), one of the founders of the theory of probability.
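Equation (1) translates directly into code. The following is a small sketch using the standard library's `math.comb`; the function name `binomial_pmf` and the sample arguments are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k: int, r: int, p: float) -> float:
    """P(N_r = k) from Equation (1); zero when k is not one of 0, 1, ..., r."""
    if k < 0 or k > r:
        return 0.0
    return comb(r, k) * p**k * (1 - p) ** (r - k)


if __name__ == "__main__":
    r, p = 3, 0.5  # illustrative values
    for k in range(r + 1):
        print(k, binomial_pmf(k, r, p))
    print(sum(binomial_pmf(k, r, p) for k in range(r + 1)))
```

The final sum is the binomial expansion of $(p + (1-p))^r$, so it equals 1 for any admissible $p$.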





latency time are often assumed to be generated by an underlying probability process. One model of such a process is the following, due to McGill.*

Assume that a system generates a Bernoulli sequence of trials with possible outcomes $R$, response, and $\bar{R}$, no response. By assuming the trials are of equal duration, the number $X$ of $\bar{R}$ outcomes occurring between two $R$ outcomes may be considered as a latency random variable. If $p = P(R)$ denotes the probability of outcome $R$, the probability function of $X$ is given by

$$P(X = k) = p(1-p)^k \quad \text{when } k = 0, 1, 2, \ldots$$

so $X$ is geometric.


A characteristic feature of any process which leads to a geometric random variable is its "lack of memory." To make this concept precise, we return to the random variable $W$. Assuming that the first $t$ trials result in failure, the probability of exactly $k$ additional failures before the first success is

$$P(W = t + k \mid W \ge t) = \frac{P(W = t + k)}{P(W \ge t)} = \frac{p(1-p)^{t+k}}{\sum_{j=t}^{\infty} p(1-p)^j} = p(1-p)^k$$

Thus, no matter how many failures have already occurred, the probability function of the number of additional failures before the first success remains the same as it was initially.

Example 2' The "lack of memory" of geometric random variables implies, in the latency model, that no matter how many $\bar{R}$ outcomes have occurred, the probability that $R$ occurs in the next trial is still $p$, as it was for the first trial. This, of course, is a reflection of the independence of the trials.
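The lack-of-memory property can be verified numerically for any particular choice of parameters. In this sketch the function names and the test values ($p = 0.2$, five initial failures) are assumptions chosen only for illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(W = k) = p (1-p)^k: k failures before the first success."""
    return p * (1 - p) ** k


def tail_probability(t: int, p: float) -> float:
    """P(W >= t) = (1-p)^t: the first t trials all fail."""
    return (1 - p) ** t


def conditional_pmf(k: int, t: int, p: float) -> float:
    """P(W = t + k | W >= t): k further failures after t initial failures."""
    return geometric_pmf(t + k, p) / tail_probability(t, p)


if __name__ == "__main__":
    p, t = 0.2, 5  # illustrative values
    for k in range(4):
        print(k, geometric_pmf(k, p), conditional_pmf(k, t, p))
```

The two printed columns agree for every $k$: conditioning on the first $t$ failures leaves the probability function unchanged.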


PROBLEMS

4. What is the probability that in a sequence of Bernoulli trials there are at least four failures before the first success?

5. Let $1 - p$ be the probability that your team wins any given world series game. Assuming that the games constitute Bernoulli trials, what is the probability that your team sweeps the series? Why does this answer coincide with that of the preceding problem?

If $W_n$ denotes the number of failures before the $n$th success, then the event $(W_n = k)$ occurs if and only if the $n$th success is preceded by exactly $k$ failures. That is to say, the $n$th success occurs on the $(k + n)$th trial, there being $k$ failures and $n - 1$ successes on the preceding $k + n - 1$ trials.

*McGill, W. J., "Stochastic Latency Mechanisms," in Handbook of Mathematical Psychology, Vol. 1, Luce, R. D., Bush, R. R., and Galanter, E., eds. (John Wiley & Sons, Inc., New York, 1963).





3. By interpreting the sum

$$\sum_{k=0}^{r} \binom{r}{k} p^k (1-p)^{r-k}$$

as a binomial expansion, show that the probabilities in (1) sum to unity.


In arriving at a binomial probability function, we used as our random variable the number of successes in a fixed number of Bernoulli trials. Suppose we now take the contrary point of view and consider the number of trials required to obtain a fixed number of successes.

Consider first the random variable $W$ representing the number of failures before the first success. Then $W + 1$ is the time (number of trials) one must wait for the first success. The first success, if it occurs at all, may come at any trial. Thus 0, 1, 2, ... all are possible values of $W$. The event that success never occurs, or that all trials result in failure, is denoted by $(W = \infty)$.

If $W = k$, where $k$ is some non-negative integer, it means that the first $k$ trials each resulted in failure, followed by success on trial $k + 1$. Conversely, if a sequence of outcomes begins with $k$ failures followed by a success, then $W = k$. Hence

$$(W = k) = \underbrace{FF \cdots F}_{k \text{ failures}}\, S$$

and we have

$$P(W = k) = (1-p)(1-p)\cdots(1-p)\,p = p(1-p)^k, \quad k = 0, 1, 2, \ldots \tag{2}$$

Because of the form of the probability function (2), $W$ is called a geometric random variable.

If $p > 0$, the sum of the probabilities in (2) is

$$\sum_{k=0}^{\infty} P(W = k) = \sum_{k=0}^{\infty} p(1-p)^k = p \cdot \frac{1}{1 - (1-p)} = \frac{p}{p} = 1$$

In this case $P(W = \infty)$ must be zero. On the other hand, if $p = 0$, each probability in (2) is zero and thus

$$\sum_{k=0}^{\infty} P(W = k) = 0, \qquad P(W = \infty) = 1$$

In short, if the probability of getting a success is positive, there is a probability of 1 that a success will occur eventually, while if the probability of success is zero, there is probability 1 that success never occurs.
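The two cases above can be illustrated by computing partial sums of the probabilities in (2). This is only a numerical sketch: the function name and the number of terms are assumptions, and a finite sum can approximate but never replace the geometric-series argument.

```python
def geometric_partial_sum(p: float, terms: int) -> float:
    """Sum of P(W = k) = p (1-p)^k for k = 0, 1, ..., terms - 1."""
    return sum(p * (1 - p) ** k for k in range(terms))


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01, 0.0):
        # For p > 0 the partial sums approach 1; for p = 0 they stay at 0,
        # leaving all of the probability on the event (W = infinity).
        print(p, [geometric_partial_sum(p, n) for n in (10, 100, 1000)])
```

The partial sum over the first $n$ terms equals $1 - (1-p)^n$, so convergence to 1 is slow when $p$ is small but still certain whenever $p > 0$.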


Example 2 In psychological experiments the time interval between stimulus and response is called the latency period. Observed fluctuations in




$$\underbrace{S\,F\,S\,S\,F \cdots F\,S}_{k+n-1 \text{ trials: } k \text{ failures and } n-1 \text{ successes in any order}} \quad \underbrace{S}_{(k+n)\text{th trial}}$$

Independence of trials implies that

$$P(W_n = k) = P(n-1 \text{ successes in } n-1+k \text{ trials}) \cdot P(S) = \binom{n-1+k}{k} p^{n-1}(1-p)^k \cdot p = \binom{n+k-1}{k} p^n (1-p)^k, \quad k = 0, 1, 2, \ldots \tag{3}$$

The first probability in the second equality of (3) is the probability of exactly $n - 1$ successes S in $n - 1 + k$ trials and is obtained directly from (1). Note that when $n = 1$, (3) reduces to (2), as it should. A short calculation shows that

$$\binom{n+k-1}{k}$$

is the same as

$$(-1)^k \binom{-n}{k}$$

(see Example 5 in Section 16.1), so that (3) may be rewritten in the form

$$P(W_n = k) = \binom{-n}{k} p^n (-q)^k \tag{4}$$

where $q = 1 - p = P(F)$. Because of the quantity $-n$ which appears in the binomial coefficient, $W_n$ is called a negative binomial random variable. The quantity $-q$ has no probabilistic interpretation.
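The agreement between forms (3) and (4) can be checked numerically. In the sketch below, `generalized_binomial` is a hypothetical helper implementing the usual extension $\binom{a}{k} = a(a-1)\cdots(a-k+1)/k!$ to negative $a$; the parameter values are arbitrary.

```python
from math import comb


def negative_binomial_pmf(k: int, n: int, p: float) -> float:
    """Form (3): probability of k failures before the n-th success."""
    return comb(n + k - 1, k) * p**n * (1 - p) ** k


def generalized_binomial(a: float, k: int) -> float:
    """Binomial coefficient a(a-1)...(a-k+1)/k!, valid for negative a."""
    result = 1.0
    for i in range(k):
        result *= (a - i) / (i + 1)
    return result


def negative_binomial_pmf_form4(k: int, n: int, p: float) -> float:
    """Form (4): the same probability written with the coefficient for -n."""
    q = 1 - p
    return generalized_binomial(-n, k) * p**n * (-q) ** k


if __name__ == "__main__":
    n, p = 3, 0.4  # illustrative values
    for k in range(5):
        print(k, negative_binomial_pmf(k, n, p), negative_binomial_pmf_form4(k, n, p))
```

The signs of $\binom{-n}{k}$ and $(-q)^k$ always cancel, which is why form (4) yields a genuine probability even though $-q$ itself has no probabilistic meaning.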

Example 3 Consider a legislative process in which bills are sent either to a finance committee or to a general committee. Bills are then called for action by choosing one of the committees at random and asking that committee to present one bill. If we identify "success" with the choice of the general committee, the successive choices of committees constitute Bernoulli trials with $p = P(S) = \frac{1}{2}$.

Suppose that $f$ bills are assigned to the finance committee and $g$ bills to the general committee. Let the random variable $N$ be the number of bills remaining in the finance committee at the time the last bill is called from the general committee. Then for $r > 0$, $(N = r)$ occurs if and only if there are $f - r$ failures (that is, bills called from the finance committee) before the $g$th success. Thus, for $r > 0$,

$$P(N = r) = \binom{g - 1 + f - r}{f - r} \left(\tfrac{1}{2}\right)^{g} \left(\tfrac{1}{2}\right)^{f-r} = \binom{g + f - r - 1}{f - r} \left(\tfrac{1}{2}\right)^{g+f-r}$$

For instance, if $g = f = 10$, then the probability that seven or more bills remain in the finance committee is

$$P(N \ge 7) = \sum_{r=7}^{10} P(N = r) = \sum_{r=7}^{10} \binom{19 - r}{10 - r} \left(\tfrac{1}{2}\right)^{20-r} = 0.046$$
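The figure 0.046 in Example 3 can be reproduced from the formula for $P(N = r)$. The function name `prob_bills_remaining` is an illustrative assumption; the computation itself uses only the expression given in the example.

```python
from math import comb


def prob_bills_remaining(r: int, f: int, g: int) -> float:
    """P(N = r) for r > 0: f - r failures occur before the g-th success, p = 1/2."""
    return comb(g + f - r - 1, f - r) * 0.5 ** (g + f - r)


if __name__ == "__main__":
    f = g = 10
    terms = {r: prob_bills_remaining(r, f, g) for r in range(7, 11)}
    print(terms)
    print(round(sum(terms.values()), 3))
```

Each term is a negative binomial probability from (3) with $n = g$, $k = f - r$, and $p = \frac{1}{2}$.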


PROBLEMS

6. Verify the calculations in Example 3.

7. In Example 3, $P(N = 0)$ is the probability

$$\sum_{k=f}^{\infty} \binom{g + k - 1}{k} \left(\tfrac{1}{2}\right)^{g+k}$$

that the number of failures before the $g$th success is at least $f$. Why is this so?

The model of Bernoulli trials applies also to problems of sampling with replacement from a finite population composed of two types of individuals. To fix ideas, suppose we have an urn containing $b$ black balls and $g$ green balls. If we draw with replacement from this urn, the successive draws form Bernoulli trials with

$$p = P(\text{black}) = \frac{b}{b+g}$$

and

$$q = 1 - p = P(\text{green}) = \frac{g}{b+g}$$

Hence, if $N_r$ denotes the number of black balls obtained in $r$ draws, we have

$$P(N_r = k) = \binom{r}{k} p^k q^{r-k} = \binom{r}{k} \left(\frac{b}{b+g}\right)^{k} \left(\frac{g}{b+g}\right)^{r-k}$$

in accordance with Equation (1).

Now let us alter our model slightly and consider draws without replacement. Then we no longer have Bernoulli trials, since the trials are no longer independent. Note, however, that this is the only condition for Bernoulli trials which fails to hold. Since we are dealing with a special kind of Polya urn, the probability of black remains the same on each draw.

The result of $r$ draws is to choose a subset of $r$ balls from the original $b + g$ balls. We shall use a sample space $\Omega$ consisting of

$$\binom{b+g}{r}$$

elements, one for each possible $r$-subset and, assuming random drawing, assign the same probability

$$\frac{1}{\binom{b+g}{r}}$$

to each simple event. Again denoting by $N_r$ the number of black balls drawn, we have

$$P(N_r = k) = \frac{\binom{b}{k}\binom{g}{r-k}}{\binom{b+g}{r}} \tag{5}$$

Equation (5) is derived by noting that $(N_r = k)$ is obtained by choosing any $k$ of the $b$ black balls and any $r - k$ balls from the $g$ green balls. Note that our agreement about binomial coefficients gives the correct value even if $k > b$. In this case we require the probability of drawing more black balls than were originally contained in the urn. This event, being impossible, must have probability zero. But, of course, when $k > b$ the binomial coefficient

$$\binom{b}{k}$$

is zero. Similar comments apply if $r - k > g$.

A random variable whose probability function has the form of Equation (5) is called a hypergeometric random variable. Such a random variable appears, then, when one counts the number of "successes" in a sample drawn without replacement from a finite population composed of two types of individuals.


PROBLEMS

8. Use Formula (5) of Chapter 7 to show that the probabilities in (5) add to unity.

Example 4 A congressional committee consists of 6 Democrats and 5 Republicans. A subcommittee of 4 members is chosen at random. If $R$ represents the number of Republicans in the chosen subcommittee, then

$$P(R = k) = \frac{\binom{5}{k}\binom{6}{4-k}}{\binom{11}{4}}, \quad k = 0, 1, 2, 3, 4$$
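The subcommittee probabilities of Example 4 follow from the hypergeometric form of Equation (5). The sketch below assumes that form, with Republicans playing the role of the $b$ "black" individuals; `hypergeometric_pmf` is an illustrative name.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(k: int, b: int, g: int, r: int) -> Fraction:
    """P(N_r = k): k of the b 'black' items in a sample of r drawn without replacement."""
    if k < 0 or r - k < 0:
        return Fraction(0)
    # math.comb returns 0 when more items are requested than are available,
    # matching the convention that such binomial coefficients vanish.
    return Fraction(comb(b, k) * comb(g, r - k), comb(b + g, r))


if __name__ == "__main__":
    # Example 4: 5 Republicans, 6 Democrats, subcommittee of 4.
    for k in range(5):
        print(k, hypergeometric_pmf(k, 5, 6, 4))
    print(sum(hypergeometric_pmf(k, 5, 6, 4) for k in range(5)))
```

Using exact fractions makes it easy to confirm that the five probabilities sum to exactly 1.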


Example 5 Probabilistic considerations allow us to deduce the likely properties of samples taken from larger collections (or populations) of objects. For instance, given the proportions of Democrats and Republicans in a certain voting population, the probability that a sample of 10 contains 3 Republicans may be computed.

The problem of statistics is just the opposite. We wish to infer properties of the population as a whole from the properties of an observed sample. To take a typical case, consider Sam Yerkes, running for mayor of Anytown, USA. In order to determine his popular support, an opinion poll is conducted involving 500 individuals chosen at random and without replacement from the total population of 3000 registered voters. Of those polled, 300 express support for Sam.

The proportion $p$ of the 3000 registered voters on whose support Sam can count is still unknown. If $j$ is the total number of voters supporting Sam, then $p = j/3000$ and the probability that 300 of 500 voters polled indicate support is

$$h(j) = \frac{\binom{j}{300}\binom{3000 - j}{200}}{\binom{3000}{500}}$$

According to the statistical principle of maximum likelihood,* the best estimate of $p$ is $m/3000$, where $m$ is the value of $j$ which makes $h(j)$ a maximum.

A simple calculation shows that $h(j)/h(j-1)$ is greater than unity when $j < \frac{3}{5}(3001)$ and less than unity when $j > \frac{3}{5}(3001)$. Thus, as $j$ increases, $h(j)$ first increases and then decreases, reaching its maximum at $j = 1800$. The maximum likelihood estimate of $p$ is $1800/3000 = \frac{3}{5}$, the same as the sample proportion.
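The maximizing value of $j$ can also be found by brute force, assuming the hypergeometric likelihood $h(j) = \binom{j}{300}\binom{3000-j}{200}/\binom{3000}{500}$ for the poll of Example 5. In this sketch the function and parameter names are hypothetical, and the constant denominator is dropped because it does not affect where the maximum occurs.

```python
from math import comb


def likelihood_numerator(j: int, population: int = 3000,
                         sample: int = 500, supporters: int = 300) -> int:
    """Numerator of h(j): ways to poll 300 supporters and 200 non-supporters."""
    return comb(j, supporters) * comb(population - j, sample - supporters)


def maximum_likelihood_count(population: int = 3000, sample: int = 500,
                             supporters: int = 300) -> int:
    """Value of j maximizing h(j), searched over every feasible j."""
    feasible = range(supporters, population - (sample - supporters) + 1)
    return max(feasible, key=lambda j: likelihood_numerator(j, population, sample, supporters))


if __name__ == "__main__":
    m = maximum_likelihood_count()
    print(m, m / 3000)
```

Python's arbitrary-precision integers keep the comparison exact, so no rounding can shift the location of the maximum.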


PROBLEMS 

_ c v 7 times in a Riven month Assume that 

9. The Vultures play the Green Sox orobabihty l of winning and 

the Green Sox are the better team and have probamn y 5 

probability I oflosing a game , ria fa. find the prob- 

(a) If the games are considered as / in p 
ability that the Green Sox win exact y 
(i) four games 
(n) five games 

(in) a majority of the games r , ahstic » 

(b) Do you think the- assumpl.on oHndepen^ ^ ^ MOu|J wm 

10. (a) What is the probability in Proven , 

.r these- tvv o teams met m t « 01 (q ,, robLm 9(a)(iti) * 

(b) Why is the answer to (a) the same a 
u. (a) Show tha, the probabilities ,n Et,ua„on (0 J ^ ^ 

(b) Show that the probabilities ,n b nomla | expansion ) 

p > 0 {Hint interpret the sum as a binon 

*After R. A. Fisher (1890-1962), British statistician.




12. A plane can fly on half its engines but not on less. Assuming engines fail independently with probability q, would you rather fly in a two-engine plane or a four-engine plane?

13. Suppose that 10 people are chosen at random from a community and asked whether they favor a school bond issue. If it is assumed that 60% of the community favor the issue, what is the probability that a majority of those sampled will oppose?

14. In considering whether to develop an Atlantic coastal island, the South Sea Development Company estimates that there is probability 3 that at least one hurricane will hit the island during any given year. The company predicts that during the first five years the development would be slightly damaged if hit during one year, in financial difficulty if hit during two years, and a complete loss if hit in three or more years. What is the probability that the company

(a) would escape unscathed?

(b) would be slightly damaged?

(c) would be in financial difficulty?

(d) would suffer complete loss?

15. In Problem 14, how many years must elapse in order for the probability of suffering at least partial damage to be at least 0.95?

16. Given that n Bernoulli trials result in exactly k successes, show that the conditional probability of a success on any particular trial is k/n.

17. (a) In Example 1 of this section, let M denote the number of decision makers who are correct in all n decisions. What is the probability that M = j?

(b) Find the probability that at least one of the m decision makers is correct in at least n - 1 decisions.

18. Suppose that the success of any one decision maker in Problem 17 is defined in terms of a perfect record on n decisions. In order to form a "brain trust," we decide to test decision makers until we find k successful ones. Find the probability that the rth decision maker tested is the kth successful one.

19. In Example 4, suppose that the subcommittee chosen contains at least one Republican.

(a) Find the conditional probability that all the subcommittee members are Republican.

(b) If another 4-man subcommittee is chosen from the remaining 7 members, determine the conditional probability that all members are Democratic.

20. The No Sneeze Company has developed a new cold pill which they hope will be effective in reducing the incidence of the common cold. In order to test their product, they have drawn a number of random samples




Similarly, if $N$ denotes the score obtained in tossing a die loaded so that

$$P(N = 1) = \tfrac{1}{10} \quad P(N = 2) = \tfrac{1}{10} \quad P(N = 3) = \tfrac{3}{10}$$

$$P(N = 4) = \tfrac{1}{5} \quad P(N = 5) = \tfrac{1}{5} \quad P(N = 6) = \tfrac{1}{10}$$

then out of, say, ten million tosses, we would expect about one million each of ones, twos, and sixes, about two million each of fours and fives, and about three million threes. The average score per toss, obtained by adding all the scores and dividing by ten million ($10 \times 10^6$), would be about

$$\frac{1(1 \times 10^6) + 2(1 \times 10^6) + 3(3 \times 10^6) + 4(2 \times 10^6) + 5(2 \times 10^6) + 6(1 \times 10^6)}{10 \times 10^6}$$

$$= 1\left(\tfrac{1}{10}\right) + 2\left(\tfrac{1}{10}\right) + 3\left(\tfrac{3}{10}\right) + 4\left(\tfrac{1}{5}\right) + 5\left(\tfrac{1}{5}\right) + 6\left(\tfrac{1}{10}\right) \tag{6}$$

The average score one would expect may thus be found by multiplying each value $v$ of the random variable $N$ by the respective probability $P(N = v)$ and summing over all possible values.

Similar considerations apply to any discrete random variable, and we formalize our discussion in the following definition.

Definition 1 Let $X$ be a discrete random variable having possible values $x_1, x_2, x_3, \ldots$. Then the expected value, or the mean, of $X$ is

$$E(X) = \sum_{i} x_i P(X = x_i) \tag{7}$$

We shall agree that the expected value of a random variable $X$ is defined only when the series (7) is absolutely convergent. Otherwise, rearranging the terms could lead to a different sum.

Adding the terms in (6), we find $E(N) = 3.6$. It is apparent, then, that the expected value of a random variable is not necessarily a value one "expects" in a single performance of the experiment. As indicated above, it should be interpreted as the average value to be approximated after many repetitions of an experiment.
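Definition 1 amounts to a weighted sum, which the sketch below applies to the loaded die. The probabilities used here (1/10, 1/10, 3/10, 1/5, 1/5, 1/10 for scores 1 through 6) are an assumption inferred from the expected counts of one, two, and three million tosses described above; `expected_value` is an illustrative name.

```python
from fractions import Fraction


def expected_value(pmf: dict[int, Fraction]) -> Fraction:
    """E(X) = sum of x * P(X = x) over the possible values x."""
    return sum(x * p for x, p in pmf.items())


# Assumed loading of the die, inferred from the expected counts in the text.
LOADED_DIE = {
    1: Fraction(1, 10),
    2: Fraction(1, 10),
    3: Fraction(3, 10),
    4: Fraction(1, 5),
    5: Fraction(1, 5),
    6: Fraction(1, 10),
}

if __name__ == "__main__":
    print(sum(LOADED_DIE.values()))    # probabilities should total 1
    print(expected_value(LOADED_DIE))  # mean score per toss
```

The result is not itself a possible score, which illustrates that the expected value describes a long-run average rather than the outcome of any single toss.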

V.™,,. , „ S «.■ — 

,h “ m „. ► 

„IV I <(l)4 ()( ) s 

„ |(i) 4 . 2 (?) + 3(J ) + ** ^ ^ n 




27. Not only does a geometric random variable exhibit lack of memory, but it is the only discrete random variable with this property. Specifically, suppose a random variable $T$ assumes values 0, 1, 2, ... with respective probabilities $p_0, p_1, p_2, \ldots$, and that the conditional probability that $T = k + 1$, given that $T > k$, is equal to the constant $p$ for every $k$. Prove that $p_k = (1 - p)^k p$, so that $T$ has a geometric probability function.

28. Imagine a game played over a sequence of $t$ trials in which on each trial each of two players makes either a cooperative C or noncooperative N response.* Each player makes his response without knowledge of the response of the other. After every trial, each is informed of the other's response. One model for this experiment assumes that the two players respond independently of one another, each player's responses constituting a sequence of $t$ Bernoulli trials and player A having probability $p_A$ and player B probability $p_B$ of making response N. Under these assumptions, find the probability function of $X$, the number of trials on which both players respond N.

29. In Problem 28, let the random variables $N_A$, $C_A$, $N_B$, and $C_B$ denote, respectively, the number of N and C responses by players A and B. Show that

P(X *= A | Na = a and Nb — b ) 

" *'(" - k)'(m - WU -n-m + k)' PaPb ^ ~ P j) (1 ~ 

whenever t > n > k > 0, t > m > k > 0, and t > n + m - k 

30. Suppose in Problem 28 that in a sequence of 10 trials player A made 6 N responses and player B made 4. This is event $E$.

(a) Find the conditional distribution function of $X$, given $E$. That is, compute $P(X \le k \mid E)$ for all $k$.

(b) Find $P(1 < X < 4 \mid E)$ and $P(X > 3 \mid E)$.

(c) If, in addition to event $E$, we observe $X = 4$, would you feel the players were choosing independently? Present some calculations to support your conclusion.


19.3 EXPECTED VALUE OF A RANDOM VARIABLE

In our relative frequency interpretation of probability, if the probability of heads for a given coin is $\frac{1}{2}$, then we expect that in the long run about half the tosses will result in heads and about half in tails. Put another way, we feel that the average number of heads in $n$ tosses will be about $n/2$.

*For a discussion of certain games of this type, see Rapoport, A., and Chammah, A. M., Prisoner's Dilemma: A Study in Conflict and Cooperation (The University of Michigan Press, Ann Arbor, Mich., 1965).





consists of two tests, each of which results in either a positive + or negative - outcome. The possible joint outcomes are then $o_1 = (+, +)$, $o_2 = (+, -)$, $o_3 = (-, +)$, and $o_4 = (-, -)$, and we assume that the conditional probabilities of these outcomes are as shown in Table 1.

TABLE 1. Outcome probabilities conditional on patient's health

Patient's            Outcome
health        o_1     o_2     o_3     o_4
S             0.75    0.10    0.10    0.05
H             0.01    0.05    0.04    0.90

In order to utilize the test information, the diagnostician must formulate a decision strategy which tells him what diagnostic action to take for each possible joint outcome of the diagnostic tests. From the strategies that might be employed, only 5 will be considered. These are shown in Table 2, T denoting treatment, and N no treatment.

TABLE 2

Diagnostic           Outcomes
strategy      o_1     o_2     o_3     o_4
D_1           T       T       T       T
D_2           T       T       T       N
D_3           T       T       N       N
D_4           T       N       N       N
D_5           N       N       N       N

Among these is a policy of "when in doubt, treat," while strategy $D_5$ represents a policy of nontreatment regardless of diagnostic information.

The worth of a decision strategy is indicated by its expected value for each state of the patient's health. Using the values for $c_d$, $c_n$, $c_s$, and $c_h$ in Example 3, we find, for instance, that the expected value of strategy $D_4$, given that the patient is sick, is

$$0.75(10) + 0.10(-15) + 0.10(-15) + 0.05(-15) = 3.75$$
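The expected value of each strategy can be tabulated mechanically. The sketch below assumes the outcome probabilities of Table 1, the values $c_d = 10$, $c_n = -2$, $c_s = -15$, $c_h = 5$ and $p = P(S) = \frac{1}{5}$ from Example 3, and the strategy patterns TTTT, TTTN, TTNN, TNNN, NNNN for $D_1$ through $D_5$ (an assumption consistent with the expected values reported for this example). All names in the code are illustrative.

```python
# Conditional outcome probabilities for o_1..o_4 (Table 1).
OUTCOME_PROBS = {"S": (0.75, 0.10, 0.10, 0.05), "H": (0.01, 0.05, 0.04, 0.90)}

# Assumed action (T = treat, N = no treatment) for each outcome o_1..o_4.
STRATEGIES = {"D1": "TTTT", "D2": "TTTN", "D3": "TTNN", "D4": "TNNN", "D5": "NNNN"}

# Values from Example 3: c_d, c_n for treatment; c_s, c_h for non-treatment.
VALUES = {("S", "T"): 10, ("H", "T"): -2, ("S", "N"): -15, ("H", "N"): 5}


def conditional_value(strategy: str, health: str) -> float:
    """Expected value of a strategy given the patient's state of health."""
    actions = STRATEGIES[strategy]
    return sum(prob * VALUES[(health, action)]
               for prob, action in zip(OUTCOME_PROBS[health], actions))


def overall_value(strategy: str, p_sick: float = 0.2) -> float:
    """Expected value averaged over the patient's health, with P(S) = p_sick."""
    return (p_sick * conditional_value(strategy, "S")
            + (1 - p_sick) * conditional_value(strategy, "H"))


if __name__ == "__main__":
    for name in STRATEGIES:
        print(name, conditional_value(name, "S"), conditional_value(name, "H"),
              overall_value(name))
```

Comparing the overall values shows how test information can favor an intermediate strategy over both blanket treatment and blanket non-treatment.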




Example 2 In the "numbers game," an amount $b$ is bet on a three-digit number, for instance, 636. A number is then chosen at random from the set {100, 101, ..., 999}. If it is 636, the bettor receives 600 times the amount of his bet. Otherwise, the bet is forfeited.

Letting $R$ denote the net return when an amount $b$ is wagered, we see that $R = 600b - b$ if the bettor wins and $R = -b$ if he loses. Hence

$$E(R) = 599b\,P(R = 599b) + (-b)\,P(R = -b) = 599b\left(\tfrac{1}{900}\right) - b\left(\tfrac{899}{900}\right) = -\tfrac{1}{3}b$$

Regardless of the amount bet, the bettor can expect to suffer an average loss equal to one-third of that amount.
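The expected return in the numbers game can be confirmed with exact arithmetic. This is a minimal sketch; `expected_net_return` is a hypothetical name, and the payoff of 600 times the bet and the 900 equally likely numbers are taken from the example.

```python
from fractions import Fraction


def expected_net_return(bet: Fraction) -> Fraction:
    """E(R) for a bet on one three-digit number drawn from 100..999."""
    p_win = Fraction(1, 900)          # one winning number among 900
    win_return = 600 * bet - bet      # payoff less the stake
    lose_return = -bet                # stake forfeited
    return win_return * p_win + lose_return * (1 - p_win)


if __name__ == "__main__":
    for bet in (Fraction(1), Fraction(3), Fraction(30)):
        print(bet, expected_net_return(bet))
```

Because the expectation is proportional to the bet, the ratio of expected loss to amount wagered is the same for every stake.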

Example 3 Scheff* contends that in the face of uncertainty, medical practitioners frequently adopt a decision policy that regards continuation of treatment as the less costly alternative. He suggests the use of "expected-value equations" as an alternative strategy.

For instance, the expected value of a treatment decision is

$$E_t = p\,c_d + (1 - p)\,c_n$$

where $p$ is the probability that a patient has the disease, $c_d$ is the treatment value for patients who have the disease, and $c_n$ is the value for patients who do not. The expected value of the cost of a non-treatment decision is

$$E_{nt} = p\,c_s + (1 - p)\,c_h$$

where $c_s$ denotes the value to a person treated as though healthy but who is actually sick and $c_h$ is the value of non-treatment for a person who is correctly diagnosed to be healthy.

As a simple illustration, suppose $c_d = 10$, $c_n = -2$, $c_s = -15$, $c_h = 5$, and $p = \frac{1}{5}$. Then

$$E_t = \tfrac{1}{5}(10) + \tfrac{4}{5}(-2) = 0.4$$

and

$$E_{nt} = \tfrac{1}{5}(-15) + \tfrac{4}{5}(5) = 1$$

Nontreatment is indicated since, on the average, this decision has the greater patient value.
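The two expected-value equations can be evaluated for any $p$, which also shows where the preferred decision changes. The sketch assumes the illustrative values $c_d = 10$, $c_n = -2$, $c_s = -15$, $c_h = 5$ from the example; the function names and the break-even calculation are additions made here for illustration, not part of Scheff's presentation.

```python
from fractions import Fraction

C_D, C_N, C_S, C_H = 10, -2, -15, 5  # values from the illustration


def treatment_value(p: Fraction) -> Fraction:
    """E_t = p*c_d + (1-p)*c_n."""
    return p * C_D + (1 - p) * C_N


def nontreatment_value(p: Fraction) -> Fraction:
    """E_nt = p*c_s + (1-p)*c_h."""
    return p * C_S + (1 - p) * C_H


def break_even_probability() -> Fraction:
    """Value of p at which E_t = E_nt, from solving the two linear equations."""
    return Fraction(C_H - C_N, (C_D - C_S) + (C_H - C_N))


if __name__ == "__main__":
    p = Fraction(1, 5)
    print(treatment_value(p), nontreatment_value(p))
    print(break_even_probability())
```

With these values, treatment has the greater expected value only when $p$ exceeds the break-even probability, so the conclusion of the example depends on $p = \frac{1}{5}$ lying below that threshold.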


Example 4 The analysis in Example 3 is somewhat deficient in that, assuming fixed treatment values, the decision policy for a particular disease depends entirely upon the probability $p$ that the patient is ill. In practice, knowledge of $p$ is augmented by diagnostic tests. For simplicity, let us assume that the patient is either sick S or healthy H and that the diagnostic procedure

*Scheff, T. J., "Decision Rules, Types of Error, and Their Consequences in Medical Diagnosis," Behavioral Science 8, 97-107 (1963).





Thus, the expectation of a binomial random variable is

$$E(N_r) = r \sum_{k=1}^{r} \binom{r-1}{k-1} p^k q^{r-k} = rp \sum_{k=1}^{r} \binom{r-1}{k-1} p^{k-1} q^{(r-1)-(k-1)}$$

Making the change of variable t = k − 1 shows that the sum is simply the
binomial expansion of $(p + q)^{r-1} = 1^{r-1} = 1$, and we obtain

$$E(N_r) = rp$$

Example 5 If from a city in which 60% of the voters are Republican, a
sample of 70 voters is taken with replacement, the expected number of
Republicans in the sample is

$$E(R) = rp = 70(0.6) = 42$$

Of course, this is not to say that we will necessarily obtain a sample
containing exactly 42 Republicans. For instance, there is an appreciable
probability that R is 40 or less, while P(R > 45) = 0.26.
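
The mean in Example 5, and the spread of R around it, can be examined by summing the binomial probability function directly. This is a sketch only; the function name `binomial_pmf` is an assumption, and no tail value is asserted beyond what the code itself computes.

```python
from math import comb

def binomial_pmf(k, r, p):
    """P(N_r = k) for r Bernoulli trials with success probability p."""
    return comb(r, k) * p**k * (1 - p)**(r - k)

r, p = 70, 0.6
pmf = [binomial_pmf(k, r, p) for k in range(r + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))   # should agree with rp
p_exactly_42 = pmf[42]                           # chance of hitting the mean exactly
p_at_most_40 = sum(pmf[:41])                     # P(R <= 40)

print(round(mean, 6))                            # 42.0
print(p_exactly_42, p_at_most_40)
```

The probability of obtaining exactly 42 Republicans is small, which is the point of the remark that the expected value need not be the observed value.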

Example 6 Let X be a random variable with the geometric probability
function

$$P(X = k) = pq^k \qquad k = 0, 1, 2, \ldots$$

Then the expected value of X is

$$E(X) = \sum_{k=0}^{\infty} k\,pq^k = pq \sum_{k=1}^{\infty} k q^{k-1}$$

Since $kq^{k-1} = d(q^k)/dq$, we have, using Theorem 4 of Chapter 16,

$$\frac{d(1 - q)^{-1}}{dq} = \frac{d\left(\sum_{k=0}^{\infty} q^k\right)}{dq} = \sum_{k=1}^{\infty} k q^{k-1}$$

and it follows that

$$E(X) = pq\,\frac{d(1 - q)^{-1}}{dq} = \frac{pq}{(1 - q)^2} = \frac{q}{p}$$

This quantity is the expected number of “failures” before the first
success in a sequence of Bernoulli trials. For instance, in successive tosses
of a fair die, the expected number of non-aces before the first ace
is

$$\frac{5/6}{1/6} = 5$$

For Example 5, the mean number of Republicans encountered before the
first non-Republican is 0.6/0.4 = 3/2.
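
The closed form q/p can be compared with a long partial sum of the defining series. A minimal sketch, assuming a truncation at 10,000 terms (an arbitrary choice that makes the neglected tail negligible for the values of p used here):

```python
def geometric_mean_partial(p, terms=10_000):
    """Partial sum of sum_k k * p * q**k for the geometric function P(X = k) = p q^k."""
    q = 1 - p
    return sum(k * p * q**k for k in range(terms))

die = geometric_mean_partial(1 / 6)      # non-aces before the first ace
voters = geometric_mean_partial(0.4)     # Republicans before the first non-Republican

print(die, (5 / 6) / (1 / 6))            # both close to 5
print(voters, 0.6 / 0.4)                 # both close to 1.5
```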




Other expected values are similarly computed and are summarized in Table 3.


TABLE 3

Patient's                        Strategy
health          D1       D2       D3       D4       D5

S               10       8.75     6.25     3.75     −15

H               −2       4.30     4.58     4.93     5

Expected
value           0.4      5.19     4.91     4.69     1


Since p = P(S) = 1/5, the entries in rows S and H of Table 3 may be
combined to give an overall expected value for each of the decision strategies.
For instance, the expected value for strategy D2 is

$$\tfrac{1}{5}(8.75) + \tfrac{4}{5}(4.30) = 5.19$$

Since this is the largest among the five strategies considered, D2 should be
selected. Scheff's recommendations (Example 3) result in choosing strategy
D5. However, as the present example shows, the expected value can be
substantially increased by the choice of a strategy in which action is
contingent upon diagnostic results. ►
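
The bottom row of Table 3 can be recomputed from its S and H rows. The sketch below uses the tabulated (rounded) entries and P(S) = 1/5; the dictionary name `table3` is an illustrative assumption.

```python
p_sick = 0.2

# (value if sick, value if healthy) for each strategy, copied from Table 3
table3 = {
    "D1": (10, -2),
    "D2": (8.75, 4.30),
    "D3": (6.25, 4.58),
    "D4": (3.75, 4.93),
    "D5": (-15, 5),
}

expected = {name: p_sick * s + (1 - p_sick) * h for name, (s, h) in table3.items()}
best = max(expected, key=expected.get)

for name, value in expected.items():
    print(name, round(value, 2))
print("best strategy:", best)   # best strategy: D2
```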


PROBLEMS 

1. Let T be the total score obtained when two fair dice are tossed. What
is E(T)?

2. If X is the number of successes in a single Bernoulli trial, what is
E(X)?

3. If Y is the number of successes in two Bernoulli trials, find E(Y). Can
you guess the expected number of successes in r Bernoulli trials?


The mean of the number $N_r$ of successes in r Bernoulli trials is

$$E(N_r) = \sum_{k=0}^{r} k\,P(N_r = k) = \sum_{k=0}^{r} k \binom{r}{k} p^k q^{r-k}$$

Since the first term is zero, we might as well sum from k = 1. When k ≥ 1
the coefficient

$$k\binom{r}{k}$$

may be rewritten as

$$\frac{k \cdot r!}{k!\,(r - k)!} = \frac{r\,(r - 1)!}{(k - 1)!\,(r - k)!} = r\binom{r-1}{k-1}$$


Theorem 1 The following list constitutes the basic properties of expected
values:

(i) If c is a real number and X a random variable, then

$$E(cX) = cE(X)$$

(ii) If X is a constant random variable, that is, if X(ω) = c for all ω in the
sample space, then

$$E(X) = c$$

(iii) If g(X) and h(X) are two functions of the random variable X, then

$$E(g(X) + h(X)) = E(g(X)) + E(h(X))$$

PROOF (i) Let g(X) = cX. Then from (9),

$$E(g(X)) = \sum_x cx\,P(X = x) = c \sum_x x\,P(X = x) = cE(X)$$

(ii) Let g(X) = c. Then

$$E(g(X)) = \sum_x c\,P(X = x) = c \sum_x P(X = x) = c \cdot 1 = c$$

(iii)

$$E(g(X) + h(X)) = \sum_x [g(x) + h(x)]\,P(X = x)$$

$$= \sum_x g(x)\,P(X = x) + \sum_x h(x)\,P(X = x)$$

$$= E(g(X)) + E(h(X))$$

PROBLEMS

6. Suppose E(X) = 2 and E(X²) = 6. Find

(a) E(4X)   (b) E(… 3X)   (c) E(X(X − 1))

7. Prove that if a and b are constants and X is a random variable, then

$$E(aX + b) = aE(X) + b$$

If X is a random variable, the expected value of the function g(X) =
[X − E(X)]² is called the variance of the random variable X and denoted
Var (X). Thus, using μ to stand for the constant E(X),

$$\operatorname{Var}(X) = E((X - \mu)^2) = \sum_x (x - \mu)^2 P(X = x) \tag{10}$$

The last term in (10) makes clear the rationale for the term “variance,”
in that the sum in a sense measures the amount by which the values of X differ




Suppose X is a random variable with known probability function and that
Z = g(X) is a function of X. Each value of X is mapped by g into exactly
one Z value, while a single Z value may be the image of several X
values. If we denote by $x_{k1}, x_{k2}, x_{k3}, \ldots$ those X values for which
$g(x_{ki}) = z_k$, then the event $(Z = z_k)$ becomes

$$(Z = z_k) = \bigcup_i\,(X = x_{ki})$$

Its probability is

$$P(Z = z_k) = \sum_i P(X = x_{ki})$$

Hence

$$E(Z) = \sum_k z_k P(Z = z_k) = \sum_k \sum_i z_k P(X = x_{ki}) = \sum_k \sum_i g(x_{ki})\,P(X = x_{ki}) \tag{8}$$

since, for each i, $g(x_{ki}) = z_k$.

Noting that each X value appears exactly once in (8), we see that the
expected value of Z = g(X) may be computed without knowing the probability
function of Z at all. For each possible value x of X, one multiplies the
value g(x) of the function g at x by the probability P(X = x) and then sums
these products. That is,

$$E(g(X)) = \sum_x g(x)\,P(X = x) \tag{9}$$


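Example 7 If X is the number obtained in tossing an honest die once,
then

$$E((X - \tfrac{7}{2})^2) = (1 - \tfrac{7}{2})^2 P(X = 1) + (2 - \tfrac{7}{2})^2 P(X = 2)$$

$$+ (3 - \tfrac{7}{2})^2 P(X = 3) + (4 - \tfrac{7}{2})^2 P(X = 4)$$

$$+ (5 - \tfrac{7}{2})^2 P(X = 5) + (6 - \tfrac{7}{2})^2 P(X = 6)$$

$$= \tfrac{25}{4}(\tfrac{1}{6}) + \tfrac{9}{4}(\tfrac{1}{6}) + \tfrac{1}{4}(\tfrac{1}{6}) + \tfrac{1}{4}(\tfrac{1}{6}) + \tfrac{9}{4}(\tfrac{1}{6}) + \tfrac{25}{4}(\tfrac{1}{6})$$

$$= \tfrac{70}{24} = \tfrac{35}{12}$$ ►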
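
Equation (9) says that E(g(X)) can be found either from the probability function of X or from that of Z = g(X). The sketch below carries out both routes for the die of Example 7; the names `pmf_x`, `pmf_z`, and `g` are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}   # honest die

def g(x):
    """Squared deviation from 7/2."""
    return (x - Fraction(7, 2)) ** 2

# Route 1: Equation (9), summing g(x) P(X = x) over the values of X
direct = sum(g(x) * px for x, px in pmf_x.items())

# Route 2: first build the probability function of Z = g(X), then take its mean
pmf_z = defaultdict(Fraction)
for x, px in pmf_x.items():
    pmf_z[g(x)] += px          # several x values may map to the same z
via_z = sum(z * pz for z, pz in pmf_z.items())

print(direct, via_z)           # 35/12 35/12
```

The second route needs the intermediate table `pmf_z`, in which each of the values 1/4, 9/4, and 25/4 carries probability 1/3; the first route skips that step.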


PROBLEMS 


4. Find the probability function of Z = (X − 7/2)² (Example 7) and then
use Definition 1 to calculate E(Z). Compare with the result of Example 7.



5. Let N denote the number of successes in r Bernoulli trials. Interpret the
following sums as expected values of certain functions of N.


(a) Z 2*>(W - *) (b) T, (4 - rpf Q pV ‘ 

w) sKO a ’ j 





which is the first formula. Writing X² = X(X − 1) + X, so that

$$E(X^2) = E[X(X - 1)] + E(X) \qquad \text{[Theorem 1 (iii)]}$$

gives the second.

Note that multiplying a random variable by a constant multiplies the
variance of the random variable by the square of that constant. This is
because variance is defined as the expected value of a squared quantity.
The square root of the variance is called the standard deviation of X and
is denoted SD(X). Theorem 2(a) implies that SD(cX) = |c| SD(X). Adding a
constant to a random variable does not change the amount of deviation
from the expected value and hence (Theorem 2(b)) does not change the
variance at all.

Example 9 Continuing Example 4, define a random variable $X_1$ associated
with decision rule D1, which takes the value 10 or −2
according to whether the patient is sick or healthy. Table 3 shows that
$E(X_1) = 0.4$, while

$$E(X_1^2) = \tfrac{1}{5}(10)^2 + \tfrac{4}{5}(-2)^2 = 23.2$$

Hence

$$\operatorname{Var}(X_1) = E(X_1^2) - [E(X_1)]^2 = 23.2 - (0.4)^2 = 23.04$$

$$SD(X_1) = \sqrt{23.04} = 4.8$$

Defining random variables $X_2, \ldots, X_5$ similarly, we obtain Table 4.


TABLE 4 Standard deviations of decision strategies

Strategy               D1      D2      D3      D4      D5

Standard deviation     4.8     1.78    0.69    0.51    8.0


The choice of whether to maximize expected value or to minimize variation
must be resolved by the practitioner.
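
The standard deviations can be recomputed from the S and H rows of Table 3 by way of Var (X) = E(X²) − [E(X)]². This is a sketch under the assumption that the rounded Table 3 entries are used as inputs, so the results for some strategies may differ from the tabulated standard deviations in the second decimal place. The helper name `mean_and_sd` is illustrative.

```python
from math import sqrt

p_sick = 0.2

# (value if sick, value if healthy) for each strategy, copied from Table 3
table3 = {
    "D1": (10, -2),
    "D2": (8.75, 4.30),
    "D3": (6.25, 4.58),
    "D4": (3.75, 4.93),
    "D5": (-15, 5),
}

def mean_and_sd(sick_value, healthy_value, p=p_sick):
    """Mean and standard deviation of a two-valued random variable."""
    mean = p * sick_value + (1 - p) * healthy_value
    second_moment = p * sick_value**2 + (1 - p) * healthy_value**2
    return mean, sqrt(second_moment - mean**2)

results = {name: mean_and_sd(s, h) for name, (s, h) in table3.items()}
for name, (mean, sd) in results.items():
    print(name, round(mean, 2), round(sd, 2))
```

Strategy D2 has the largest mean, while the strategies with the smallest spread are D3 and D4, which is the trade-off the example leaves to the practitioner.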

Example 10 If $N_r$ denotes the number of successes in r Bernoulli trials,
then

$$E(N_r(N_r - 1)) = \sum_{k=0}^{r} k(k - 1)\binom{r}{k} p^k q^{r-k}$$



or vary from the expected value μ. The variance is large when there is large
probability that X takes a value which differs substantially from μ, and small
when the values of X cluster closely around μ. The variance of X is never
negative (no negative terms appear in the sum) and is zero only when the
only possible value of X is μ itself.


Example 8 Example 1 shows that the expected value of the number X
obtained in tossing a fair die is

$$E(X) = \sum_{k=1}^{6} k\,P(X = k) = \tfrac{7}{2}$$

Hence the variance of X is 35/12, as derived in Example 7. ►


Theorem 2 Let a and b be constants and X be a random variable. Then

(a) Var (aX) = a² Var (X)

(b) Var (X + b) = Var (X)

(c) Var (X) = E(X²) − [E(X)]²

= E(X(X − 1)) + E(X) − [E(X)]²


PROOF

(a)

$$\begin{aligned}
\operatorname{Var}(aX) &= E([aX - E(aX)]^2) && \text{[Definition of variance]} \\
&= E([aX - aE(X)]^2) && \text{[Theorem 1 (i)]} \\
&= E(a^2[X - E(X)]^2) \\
&= a^2 E([X - E(X)]^2) && \text{[Theorem 1 (i)]} \\
&= a^2 \operatorname{Var}(X) && \text{[Definition of variance]}
\end{aligned}$$

(b)

$$\begin{aligned}
\operatorname{Var}(X + b) &= E([X + b - E(X + b)]^2) && \text{[Definition of variance]} \\
&= E([X + b - E(X) - b]^2) && \text{[Problem 7]} \\
&= E([X - E(X)]^2) \\
&= \operatorname{Var}(X) && \text{[Definition of variance]}
\end{aligned}$$

(c)

$$\operatorname{Var}(X) = E([X - E(X)]^2) = E(X^2 - 2XE(X) + [E(X)]^2)$$

Using Theorem 1 gives

$$\operatorname{Var}(X) = E(X^2) - 2E(X)E(X) + [E(X)]^2 = E(X^2) - [E(X)]^2$$




12. You are offered the following proposition: Toss a fair die until something
other than a “one” appears uppermost. You pay 50¢ per toss and
if n tosses are required, you win $2^n$ dollars.

(a) Find the expected value of your winnings (gross).

(b) Find the expected value of your costs.

(c) Find the expected value of your net winnings. Would you be willing
to play the game?

13. In Problem 12, how much would you be willing to pay per toss if the die
were replaced by a fair coin and you were required to toss until tails
appears?

14. Let X be a random variable with E(X) = μ and Var (X) = σ². Let
Z = (X − μ)/σ. Show that E(Z) = 0 and Var (Z) = 1. (Z is
called the standardized random variable corresponding to X.)

15. If X is the number of successes in n Bernoulli trials, find the expected
value and variance of X/n.

16. Show that if X is the number of “ S “ CC ^“” ^men^of which s'- 
without replacement from a populatio 

are designated as successes, then* 

X vifw = min - r) /O' - » " re W1 ' h £ ’It.ve ef- 

(c) Find hm»_,» Var (Y) What do 

fects of sampling with and withou p 

17. A random variable U which takes values 0, 1, 2, ..., n with equal
probabilities 1/(n + 1) is called a uniform random variable. Show that
E(U) = n/2.


18. The probability function of U m 'p{U = (" /2 > “ “ 

the point n/2 in the sense that for al rea function of any random 

- (n/2) + cl Prove that m 


•uiiiiuit la • 

19. In a training device a malfunction is introduced at one of n possible
locations and the trainee is required to check locations successively until
he locates the source of the malfunction. Assuming he makes no errors
in his diagnoses, find the probability function, expected value, and
variance of the number T of trials required to locate the source of the
malfunction if

(a) on each trial he selects at random one of the n possible locations

(b) he chooses only from locations not previously checked

20. Prove that for any real number c and a random variable X,

$$E([X - c]^2) = \operatorname{Var}(X) + [E(X) - c]^2$$

Therefore show that $E([X - c]^2)$ is a minimum when c = E(X).




Since the first two terms are zero, and for k ≥ 2,

$$k(k - 1)\binom{r}{k} = \frac{r!}{(k - 2)!\,(r - k)!} = r(r - 1)\binom{r-2}{k-2}$$

we have

$$E(N_r(N_r - 1)) = \sum_{k=2}^{r} r(r - 1)\binom{r-2}{k-2} p^k q^{r-k} = r(r - 1)p^2$$

Hence

$$\operatorname{Var}(N_r) = E(N_r(N_r - 1)) + E(N_r) - [E(N_r)]^2$$

$$= r(r - 1)p^2 + rp - r^2p^2 = rp(1 - p) = rpq$$ ►
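
The three binomial identities E(N_r) = rp, E(N_r(N_r − 1)) = r(r − 1)p², and Var (N_r) = rpq can be confirmed exactly for particular values of r and p. The sketch below uses r = 10 and p = 3/10, values chosen only for illustration; the function name `binomial_moments` is an assumption.

```python
from fractions import Fraction
from math import comb

def binomial_moments(r, p):
    """Mean, E(N(N-1)), and variance of a binomial variable by direct summation."""
    q = 1 - p
    pmf = [comb(r, k) * p**k * q**(r - k) for k in range(r + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    factorial_moment = sum(k * (k - 1) * pk for k, pk in enumerate(pmf))
    variance = factorial_moment + mean - mean**2   # Theorem 2(c), second form
    return mean, factorial_moment, variance

r, p = 10, Fraction(3, 10)
mean, factorial_moment, variance = binomial_moments(r, p)
print(mean, factorial_moment, variance)            # 3 81/10 21/10
```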


PROBLEMS 

8. A public opinion pollster finds that 40% of his mailed questionnaires are
not returned. For these questionnaires, he initiates a followup which, due
to extra labor, postage, handling, etc., costs $5 per questionnaire. For
an initial mailing of 1000 questionnaires, what is his expected followup
cost?

9. A 1967 survey of 61 male 1948 high school graduates showed the
following numbers of children:*


Number n of
children                  0    1    2    3    4    5    6    7    Total

Number of graduates
having n children         2    3    22   17   13   2    1    1    61


Let C be a random variable denoting number of children and suppose
that the above figures indicate the probability function for C, that is,
P(C = 0) = 2/61, ..., P(C = 7) = 1/61, and P(C > 7) = 0. Find the
expected value and the variance of C.

10. Let X be a binary random variable with P(X = 1) = p and P(X = 0) =
q = 1 − p. Show that E(X) = p and Var (X) = pq.

11. Show that if X is a geometric random variable with $P(X = k) = pq^k$,
k = 0, 1, 2, ..., then Var (X) = q/p². (Hint: Compute E(X(X − 1)),
using Example 10 as a guide.)

*Bohlen, J. M., and Beal, G. M., “Some Social Changes in Rural Iowa,
1943-1967,” Rural Sociology Reports #69, Iowa State University, Ames, Iowa, 1967.





TABLE 5

                           Values of Y            Probability
                       0        1        2        function for X

                0      0        0        1/8      1/8
Values          1      0        1/4      1/8      3/8
of X            2      1/8      1/4      0        3/8
                3      1/8      0        0        1/8

Probability
function for Y         1/4      1/2      1/4

The probability functions for the individual random variables appear in
the margins of the table and are found by summing the elements in the
appropriate row or column. For instance, the probability of the event
(X = 2) is obtained by adding the probabilities

$$P(X = 2, Y = 0) + P(X = 2, Y = 1) + P(X = 2, Y = 2)$$

For simplicity, we have introduced a new notation here for the intersection
of events. The intersection is denoted by a comma; thus (X = 2, Y = 0)
denotes the simultaneous occurrence, or the intersection, of events (X = 2)
and (Y = 0).

If X and Y are any two random variables defined on the same sample space,
the function $p_{X,Y}$ defined for all real numbers x and y by

$$p_{X,Y}(x, y) = P(X = x, Y = y)$$

is called the joint probability function of X and Y.
Suppose that Y has values $y_1, y_2, y_3, \ldots$. Then the event $(X = x_j)$
occurs, if it occurs at all, in conjunction with one of the events $(Y = y_1)$,
$(Y = y_2), \ldots$. That is,

$$(X = x_j) = \bigcup_k\,(X = x_j, Y = y_k)$$

where the union is taken over all values $y_k$ of the random variable Y.
Since these events are disjoint, it follows that

$$P(X = x_j) = \sum_k P(X = x_j, Y = y_k) \tag{11}$$

In terms of the joint probability function, (11) says that the probability of
$(X = x_j)$ may be obtained by summing the joint probability function of X




21. A manufacturer of summer beach clothing must begin production in
advance of the seasonal demand period. He estimates that if x units are
produced and total demand is d units, his profit R is

$$R(x, d) = \begin{cases} \$1.50x & \text{for } 0 \le x \le d \\ \$1.50d - \$0.50(x - d) & \text{for } x > d \end{cases}$$

Let us suppose that demand D is a random variable with probability
function


d              1000     2000     3000     6000

P(D = d)       0.1      0.4      0.3      0.2


Assuming that all production is in lots of 1000 units, how many units
should he produce if he wishes to

(a) produce for the expected demand?

(b) maximize expected profit?

(c) minimize the probability of a loss?

(d) maximize the probability of a profit of $3000 or more?

(e) minimize the variance of profit?


19.4 JOINTLY DISTRIBUTED RANDOM VARIABLES

While our preceding discussion has dealt with random variables singly, it is
often the case that there is more than one random variable of interest relative
to an experimental situation. Each random variable, of course, has its own
probability function, but the random variables taken together have a joint
probability function and are said to be jointly distributed.

Example 1 In three tosses of a fair coin let X be the total number of heads
and let Y be the number of tails on the first two tosses. The sample space,
together with the corresponding values for the random variables X and Y, is

Ω = {HHH, HHT, HTH, HTT, THH, TTH, THT, TTT}

X      3    2    2    1    2    1    1    0
Y      0    0    1    1    1    2    1    2

The joint probability function of X and Y is shown in Table 5. For instance,
the event that (X = 2) and (Y = 1) simultaneously is the event {HTH, THH}
which has probability 1/4, while the probability of (X = 1) ∩ (Y = 0) is
zero. In general, the entry in row i (where i = 0, 1, 2, 3) and column j
(where j = 0, 1, 2) is the probability that (X = i) and (Y = j).
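
The joint probability function of Example 1 can be generated by listing the eight equally likely outcomes. The following is a minimal sketch; the names `joint`, `marginal_x`, and `marginal_y` are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

joint = Counter()
for tosses in product("HT", repeat=3):      # the 8 equally likely outcomes
    x = tosses.count("H")                   # total number of heads
    y = tosses[:2].count("T")               # tails on the first two tosses
    joint[(x, y)] += Fraction(1, 8)

# Marginal probability functions: row sums and column sums of the joint table
marginal_x = {i: sum(joint[(i, j)] for j in range(3)) for i in range(4)}
marginal_y = {j: sum(joint[(i, j)] for i in range(4)) for j in range(3)}

print(joint[(2, 1)], joint[(1, 0)])         # 1/4 0
print(marginal_x)
print(marginal_y)
```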





TABLE 7 Joint probability table for X and Y

                                  Values of Y                          Probability
                  1        2        3        4        5        6       function for X

             1    0        0.101    0.079    0.045    0.036    0.026   0.287
             2    0.143    0        0.036    0.026    0.015    0.008   0.228
Values       3    0.112    0.034    0        0.025    0.010    0.006   0.187
of X         4    0.078    0.025    0.023    0        0.007    0.004   0.137
             5    0.054    0.018    0.011    0.007    0        0.003   0.093
             6    0.038    0.010    0.009    0.006    0.005    0       0.068


PROBLEMS „ , . 

ToWp 7 for the remainder of the prob 

'■ 25 5 «... -"- r 

> a— (II) and (I.) — 155 “ 

P(X = 0) = FIX = 1) = 4 “d - ®> bili \ ables for A' and Y, thus 
Write two different possible joint P ( f gnc tions is not suf- 

showing that knowledge of md.v.dua P f^n 
ficent to determine the joint probability functio 

Consider two random variables X with values $x_1, x_2, x_3, \ldots$, and Y with
values $y_1, y_2, y_3, \ldots$. For each value $y_k$, the conditional probability function
of X, given $Y = y_k$, is defined by

$$P(X = x_j \mid Y = y_k) = \frac{P(X = x_j, Y = y_k)}{P(Y = y_k)} \qquad j = 1, 2, 3, \ldots$$

In this formula, k is held fixed while j varies. Note that there is a different
conditional probability function for each $y_k$. Similarly, the conditional
probability function of Y, given $X = x_j$, is given by

$$P(Y = y_k \mid X = x_j) = \frac{P(X = x_j, Y = y_k)}{P(X = x_j)} \qquad k = 1, 2, 3, \ldots$$

Two random variables are called independent if every pair of events
$(X = x_j)$ and $(Y = y_k)$ is a pair of independent events.
Thus X and Y are independent if

$$P(X = x_j, Y = y_k) = P(X = x_j)\,P(Y = y_k)$$




and Y over the values $y_1, y_2, \ldots$ of Y. (The number of values of Y may be
either finite or infinite. Thus, no specific limits have been indicated in (11).)
A similar argument yields

$$P(Y = y_k) = \sum_j P(X = x_j, Y = y_k) \tag{12}$$


That is, the probability of $(Y = y_k)$ is obtained by summing the joint
probability function over all values of X.


Example 2 Bales and his associates* obtained Table 6 of who-to-whom
frequencies of verbal interaction in a six-member discussion group. Let X
denote the rank order of the individual initiating a verbal communication
and Y the rank order of the individual to whom the communication is
directed. Then assuming that the frequencies in Table 6 represent the true
probabilities of the various communications, the joint probability function
for X and Y is as shown in Table 7.


TABLE 6

                                 To individual
                    1       2       3       4       5       6      Total

               1    0       1238    961     545     445     317    3506
               2    1748    0       443     310     175     102    2778
From           3    1371    415     0       305     125     69     2285
individual     4    952     310     282     0       83      49     1676
               5    662     224     144     83      0       28     1141
               6    470     126     114     65      44      0      819

Total               5203    2313    1944    1308    872     565    12205


From Table 7, we see, for example, that the probability of a verbal
communication from the first-ranked individual to the fourth-ranked individual
is P(X = 1, Y = 4) = 0.045. Of course, any communication initiated by
the first-ranked individual must be directed to one of the individuals in the
group. Thus the event (X = 1) occurs in conjunction with one of the events
(Y = 1), (Y = 2), ..., (Y = 6) so that

$$P(X = 1) = \sum_{j=1}^{6} P(X = 1, Y = j)$$

$$= 0 + 0.101 + 0.079 + 0.045 + 0.036 + 0.026 = 0.287$$ ►
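
The passage from the frequencies of Table 6 to a joint probability function is a division by the grand total, and the marginal functions are row and column sums. The sketch below shows this; individual entries rounded this way may differ by 0.001 from those printed in Table 7. The names `counts`, `joint`, `marginal_x`, and `marginal_y` are illustrative assumptions.

```python
# Who-to-whom frequencies from Table 6: row = initiator X, column = recipient Y
counts = [
    [0, 1238, 961, 545, 445, 317],
    [1748, 0, 443, 310, 175, 102],
    [1371, 415, 0, 305, 125, 69],
    [952, 310, 282, 0, 83, 49],
    [662, 224, 144, 83, 0, 28],
    [470, 126, 114, 65, 44, 0],
]

total = sum(map(sum, counts))
joint = [[c / total for c in row] for row in counts]

marginal_x = [sum(row) for row in joint]          # Equation (11): sum over Y
marginal_y = [sum(col) for col in zip(*joint)]    # Equation (12): sum over X

print(total)                       # 12205
print(round(joint[0][3], 3))       # 0.045, that is P(X = 1, Y = 4)
print(round(marginal_x[0], 3))     # 0.287, that is P(X = 1)
```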

*Bales, R. F., et al., “Channels of Communication in Small Groups,” American Sociological
Review 16, 461-468 (1951).





Our discussion so far has been restricted to two random variables. However,
most of the above considerations may be extended to any number of
random variables. For instance, for three random variables X, Y, and Z we
define the joint probability function $p_{X,Y,Z}$ by

$$p_{X,Y,Z}(x, y, z) = P(X = x, Y = y, Z = z)$$

Then the probability function of X alone is

$$P(X = x_j) = \sum_k \sum_l P(X = x_j, Y = y_k, Z = z_l)$$

obtained by summing the joint probability function over all values of Y
and Z having positive probability, that is, summing over all k and l.

An important probability model involving several random variables is a
generalization of Bernoulli trials to allow more than two possible outcomes
on each trial. Such a probability scheme is called multinomial trials.

multinomial trials

(i) Trials are independent.

(ii) Each trial results in exactly one of the outcomes $O_1, O_2, \ldots, O_m$
(m ≥ 2).

(iii) The probabilities $p_1 = P(O_1), p_2 = P(O_2), \ldots, p_m = P(O_m)$ are
the same from trial to trial. (Of course, $\sum_{i=1}^{m} p_i = 1$.)

It is natural to introduce m random variables $X_1, X_2, \ldots, X_m$, where $X_i$
denotes the number of occurrences of outcome $O_i$ in r trials. For each
fixed value of i (1 ≤ i ≤ m), the random variable $X_i$ has a binomial
probability function. This is because the trials are independent and, relative to
outcome $O_i$, there are two possible outcomes: $O_i$ occurs or it does
not, with respective probabilities $p_i$ and $1 - p_i$. Thus the probability
function of $X_i$ is

$$P(X_i = k) = \binom{r}{k} p_i^k (1 - p_i)^{r-k} \qquad k = 0, 1, 2, \ldots, r$$

Since there are m possible outcomes on each trial, a sample space for r
multinomial trials contains $m^r$ elements, each denoting a possible
sequence of r outcomes. Because of the independence of trials,
any particular sequence in which outcome $O_1$ occurs $k_1$ times, $O_2$ occurs
$k_2$ times, ..., $O_m$ occurs $k_m$ times is assigned probability

$$p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$$

As seen (Problem 18, Section 7.2), the number of such sequences
is the multinomial coefficient

$$\binom{r}{k_1, k_2, \ldots, k_m} = \frac{r!}{k_1!\,k_2! \cdots k_m!}$$




for every pair of values $(x_j, y_k)$. In this special case, all conditional probability
functions are identical. In fact, we have

$$P(X = x_j \mid Y = y_k) = \frac{P(X = x_j, Y = y_k)}{P(Y = y_k)} = \frac{P(X = x_j)\,P(Y = y_k)}{P(Y = y_k)} = P(X = x_j)$$


Example 3 From Table 7 in Example 2, we compute

$$P(X = 1 \mid Y = 4) = \frac{P(X = 1, Y = 4)}{P(Y = 4)} = \frac{0.045}{0.109} = 0.413$$

Since P(X = 1) = 0.287 is not equal to 0.413, we conclude that X and Y
are not independent. ►
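
The computation in Example 3 extends to the whole conditional probability function of X given Y = 4. The sketch below works from the rounded entries of Table 7; the function name `conditional_x_given_y` is an illustrative assumption.

```python
# Joint probabilities from Table 7: row = value of X, column = value of Y
table7 = [
    [0.000, 0.101, 0.079, 0.045, 0.036, 0.026],
    [0.143, 0.000, 0.036, 0.026, 0.015, 0.008],
    [0.112, 0.034, 0.000, 0.025, 0.010, 0.006],
    [0.078, 0.025, 0.023, 0.000, 0.007, 0.004],
    [0.054, 0.018, 0.011, 0.007, 0.000, 0.003],
    [0.038, 0.010, 0.009, 0.006, 0.005, 0.000],
]

def conditional_x_given_y(table, y):
    """P(X = j | Y = y) for j = 1..6: the column for y divided by its sum P(Y = y)."""
    column = [row[y - 1] for row in table]
    p_y = sum(column)
    return [value / p_y for value in column]

cond = conditional_x_given_y(table7, 4)
p_x1 = sum(table7[0])                       # marginal P(X = 1)

print(round(cond[0], 3), round(p_x1, 3))    # 0.413 0.287
```

Since the conditional and marginal probabilities differ, the product rule for independence fails for this pair of values.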


Example 4 Suppose X and Y are independent random variables having
the same geometric probability function

$$P(X = k) = P(Y = k) = pq^k \qquad k = 0, 1, 2, \ldots$$

If Z is the sum X + Y, the event (Z = n) occurs if X takes one of the values
0, 1, 2, ..., n and Y = n − X. Thus

$$(Z = n) = \bigcup_{j=0}^{n}\,(X = j, Y = n - j)$$

and since X and Y are independent,

$$P(Z = n) = \sum_{j=0}^{n} P(X = j)\,P(Y = n - j) = \sum_{j=0}^{n} pq^j\,pq^{n-j} = (n + 1)p^2q^n$$

Hence the conditional probability function of X, given Z = n, is

$$P(X = j \mid Z = n) = \frac{P(X = j, Z = n)}{P(Z = n)} = \frac{P(X = j, Y = n - j)}{P(Z = n)} = \frac{p^2q^n}{(n + 1)p^2q^n} = \frac{1}{n + 1} \qquad j = 0, 1, \ldots, n$$

Given that Z = n, X has a uniform probability function. ►
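
Both conclusions of Example 4 can be confirmed exactly for particular values. The sketch below uses p = 1/3 and n = 4, chosen only for illustration; the function names `geometric_pmf` and `sum_pmf` are assumptions.

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(X = k) = p q^k for k = 0, 1, 2, ..."""
    return p * (1 - p) ** k

def sum_pmf(n, p):
    """P(X + Y = n) for independent geometric X and Y, by direct convolution."""
    return sum(geometric_pmf(j, p) * geometric_pmf(n - j, p) for j in range(n + 1))

p, n = Fraction(1, 3), 4
p_z = sum_pmf(n, p)
conditional = [geometric_pmf(j, p) * geometric_pmf(n - j, p) / p_z for j in range(n + 1)]

print(p_z == (n + 1) * p**2 * (1 - p) ** n)   # True
print(conditional)                            # five equal entries of 1/5
```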


The following theorem is frequently used in applications of random
variables. Its proof is omitted.

Theorem 3 Let X and Y be independent random variables and suppose
that Z = h(X) and W = g(Y) are functions of X and Y, respectively. Then
Z and W are independent. ►





Example 6 Let $X_1, X_2, \ldots, X_n$ be independent random variables each
having the uniform probability function

$$P(X_i = k) = \frac{1}{N} \qquad \text{for } k = 1, 2, \ldots, N$$

Let us find the probability function of the largest of the $X_i$, a random variable
which we denote by Z.
The reasoning is as follows: The probability P(Z = r) is difficult to obtain
directly so we use an indirect approach. Note that the event (Z ≤ r) may be
partitioned into the events (Z = r) and (Z ≤ r − 1). Hence

$$P(Z \le r) = P(Z = r) + P(Z \le r - 1)$$

or

$$P(Z = r) = P(Z \le r) - P(Z \le r - 1)$$

But (Z ≤ r) if and only if all $X_i$ are less than or equal to r. That is,

$$P(Z \le r) = P(X_1 \le r, X_2 \le r, \ldots, X_n \le r)$$

Since independence of the $X_i$ implies independence of the events $(X_1 \le r)$,
$(X_2 \le r), \ldots, (X_n \le r)$, we have

$$P(Z \le r) = P(X_1 \le r) \cdot P(X_2 \le r) \cdots P(X_n \le r) = \frac{r}{N} \times \frac{r}{N} \times \cdots \times \frac{r}{N} = \left(\frac{r}{N}\right)^n$$

Finally,

$$P(Z = r) = \left(\frac{r}{N}\right)^n - \left(\frac{r - 1}{N}\right)^n \qquad r = 1, 2, \ldots, N$$
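
The formula for the largest of n uniform variables can be compared with a brute-force enumeration of all $N^n$ equally likely samples. A minimal sketch, with N = 6 and n = 3 chosen only for illustration (the largest face in three tosses of a die); the function names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def max_pmf_by_enumeration(N, n):
    """P(Z = r) found by listing every sample (x_1, ..., x_n) and taking its maximum."""
    counts = Counter(max(sample) for sample in product(range(1, N + 1), repeat=n))
    return {r: Fraction(counts[r], N**n) for r in range(1, N + 1)}

def max_pmf_by_formula(N, n):
    """P(Z = r) = (r/N)^n - ((r-1)/N)^n."""
    return {r: Fraction(r, N) ** n - Fraction(r - 1, N) ** n for r in range(1, N + 1)}

enumerated = max_pmf_by_enumeration(6, 3)
formula = max_pmf_by_formula(6, 3)
print(enumerated == formula)     # True
print(formula[6])                # 91/216
```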


PROBLEMS 


4. Returning to Example 2, y _ 2 . 

(a) find the conditional probability function o , S vcn ^ ^ ^ 

(b) find the conditional probability function o ■ , 

(c) find the probability function ofZ = A 

(d) find P(X - Y > 0). 

5. Suppose that A’ and V base the foltouing Joint probability ■ 


~ 0.13 0 05 0.10 0 3 u 

, 0 01 0 08 017 0 21 047 

2 000 000 010 002 


Thus the joint probability function of the random variables $X_1, X_2, \ldots, X_m$ is

$$P(X_1 = k_1, X_2 = k_2, \ldots, X_m = k_m) = \binom{r}{k_1, k_2, \ldots, k_m} p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m} \tag{13}$$

for those combinations of values of $k_1, k_2, \ldots, k_m$ for which $\sum_{i=1}^{m} k_i = r$.
Otherwise this probability is zero.


Example 5 In a study by Anderson* of attitude changes over time
among a group of people, use is made of the multinomial distribution.
Anderson postulates that at any given time, a person is in one of m attitude
states. To simplify our discussion, we consider only those N persons in the
first attitude state at time t.† Each of these, independently of all others, is
assumed to have probability $p_j$ (where j = 1, 2, ..., m) of moving to
attitude state $S_j$ at time t + 1. Thus these N persons may be considered as
constituting N multinomial trials. If for j = 1, 2, ..., m, we denote the
number who move to state $S_j$ by $X_j$, we have

$$P(X_1 = n_1, X_2 = n_2, \ldots, X_m = n_m) = \binom{N}{n_1, n_2, \ldots, n_m} p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$$

For instance, if N = 4 and $p_1 = p_2 = p_3 = \tfrac{1}{3}$, then

$$P(X_1 = 2, X_2 = 2, X_3 = 0) = \frac{4!}{2!\,2!\,0!} \left(\tfrac{1}{3}\right)^2 \left(\tfrac{1}{3}\right)^2 \left(\tfrac{1}{3}\right)^0 = \frac{2}{27}$$ ►
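
Formula (13) is easy to evaluate exactly. The sketch below reproduces the numerical illustration of Example 5; the function name `multinomial_pmf` and its argument order are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = k_1, ..., X_m = k_m) for r = sum(counts) multinomial trials."""
    coefficient = factorial(sum(counts))
    for k in counts:
        coefficient //= factorial(k)       # stays an integer at every step
    return coefficient * prod(p**k for p, k in zip(probs, counts))

third = Fraction(1, 3)
print(multinomial_pmf([2, 2, 0], [third, third, third]))   # 2/27
```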


PROBLEMS 

3 Suppose that the daily closing market quotations for the Skidmore Com 
pany preferred stock form multinomial trials with outcomes O i, an 
increase in market value, 0 2 no change in value, and O 3 , a decrease 
in market value having probabilities and respectively Let 

*1 * 2 , and X-i denote the respective numbers of outcomes 0\, O 2 , a°d 
O 3 in a three day period For each choice of ki, k 2 , and kz what are the 
probabilities that X x = ki X 2 - k 2 and X 3 = k ^ {Hint A partial 
answer is 

/>(*, “ 0 , * 2 - 1, X, = 2) - 3(^X*) 2 = && 

- P{X 1 = 1,*2 = 1 ,Xi = »> 

P(ATi — k 1 , X 2 — k 2 Xi = * 3 ) = 0 unless k\ + k 2 + A .3 = 3 ) 

The definition of independence is extended to any number of random
variables $X_1, X_2, \ldots, X_n$ by requiring that the n events $(X_1 = x_1)$,
$(X_2 = x_2), \ldots, (X_n = x_n)$ be independent for every n-tuple of values
$(x_1, x_2, \ldots, x_n)$.

*Anderson, T. W., “Probability Models for Analyzing Time Changes in Attitudes,”
in Mathematical Thinking in the Social Sciences, Lazarsfeld, P. F., Ed. (Free Press,
Glencoe, Ill., 1954).

†Anderson's original model was a Markov chain.




9. Suppose in Problem 7 we are interested only in the change from
“Republican” to “other” and from “other” to “Republican” at time t. Then of
eight Republicans, seven Democrats, and four Independents find the
probability that

(a) $n_{RR} = 5$, $n_{RO} = 3$, $n_{OR} = 2$, $n_{OO} = 9$

(b) $n_{RR} + n_{OR} = 7$, $n_{RO} + n_{OO} = 12$

10. The output of a certain filling machine of the Crunch E Peanut Butter
Company is regarded as a Bernoulli process with probability 0.9 of
filling a jar within acceptable tolerances and 0.1 of not doing so. A
particular jar may or may not be inspected with probability 0.1 and 0.9,
respectively.

(a) If the decision to inspect any particular jar is made without prior
knowledge as to fill level, what is the probability that of eight jars
filled by the machine, four are not inspected and are within
acceptable tolerances, one is inspected and is within acceptable tolerances,
one is inspected and is not within acceptable tolerances, and two are
not inspected and are not within acceptable fill tolerances?

(b) A jar is “passed” if the jar is either within tolerances and is inspected
or is not inspected. What is the probability that of seven jars passed,
five are within acceptable fill tolerances?

(c) Show that if k of N jars are passed, the conditional probability that
exactly n are within tolerances is

$$\binom{k}{n} p^n (1 - p)^{k-n}$$

where p = 0.9/(0.09 + 0.81 + 0.09) is the probability that a jar
is within acceptable tolerances given that the jar has been passed.
(d) Define the random variable X as the percent of exactly A 
are within acceptable tolerances If in • of the 

jars are passed, find the probability that (n/k) X P 
A jars are within acceptable tolerances 
(i) given the previously defined inspection plan 

00 if no inspection is performed „ ro „m<>nt 

How might you use the ratio of .he two probabilities as an argumen 
for the adoption of the inspection plan 9 
11. (a) Let X and Y be independent random variables each of which has the same
geometric probability function. Find the probability function of
Z = larger of X and Y. (Hint: Proceed as in Example 6.)

(b) Obtain the probability function of Z by first deriving the joint
probability function of X and Z.
( c ) Hrid the probability function of fi — A 4* 

12. Show that if $X_1, X_2, \ldots, X_n$ are independent random variables having
the same geometric probability function, then $X_1 + X_2 + \cdots + X_n$ is
a negative binomial random variable. Interpret this result in terms of
waiting times.




(a) Find the probability function of the random variable Z = X/Y.

(b) Show that X and Z are dependent random variables.

6. Two subjects are observed on 10 trials of an experiment. On each trial,
each subject may choose to respond cooperatively C or competitively D.

(a) Assuming that the players respond independently and that the
responses of each player form Bernoulli trials with respective
probabilities $p_1(C) = 0.5$ and $p_2(C) = 0.3$ of choosing response C, what is
the probability of obtaining the following joint response frequency
table?


                      Player II
                      C      D

Player I       C      1      2
               D      1      6


(b) Find the probability of obtaining exactly k DD responses,
given that Player I makes three C and seven D responses and Player
II makes two C and eight D responses.

7. Assume that a person who votes Republican m one election will in the 
next election vote Republican with probability f, vote Democratic with 
probability and vote Independent with probability Assuming that 
voting behavior is independent from one individual to another, what is 
the probability that of eight people who vote Republican in the Spring 

(a) five vote Republican, two vote Democratic, and one votes
Independent in the Fall?

(b) at least one voter changes in the Fall?

(c) the number who change equals the number who do not?

8. Assume that in Problem 7 the respective probabilities that a Democratic
voter next votes Republican, Democratic, or Independent are 0.1, 0.7,
and 0.2, while the corresponding probabilities for an Independent are
0.1, 0.1, and 0.8. What is the probability that of eight Republicans, seven
Democrats, and four Independents

(a) at least one voter in each party changes?

(b) exactly one in each party changes?

(c) Writing $n_{RR}$ for the number of Republicans who again vote
Republican, $n_{RD}$ for the number of Republicans who next vote
Democratic, etc., what is the probability that

$$n_{RR} = 5 \qquad n_{RD} = 2 \qquad n_{RI} = 1$$

$$n_{DR} = 1 \qquad n_{DD} = 4 \qquad n_{DI} = 2$$

$$n_{IR} = 1 \qquad n_{ID} = 1 \qquad n_{II} = 2$$




function of X without knowledge of the probability function of g(X).
Specifically, from Equation (9) we have

$$E(g(X)) = \sum_x g(x)\,P(X = x) \tag{14}$$

An argument similar to that leading to (14) may be used to establish the
following more general theorem, the proof of which is omitted.

Theorem 4 If $g(X_1, X_2, \ldots, X_n)$ is any function of the random variables
$X_1, X_2, \ldots, X_n$, then

$$E(g(X_1, \ldots, X_n)) = \sum g(x_1, x_2, \ldots, x_n)\,P(X_1 = x_1, \ldots, X_n = x_n)$$

the sum being taken over all n-tuples $(x_1, x_2, \ldots, x_n)$ for which
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$ is positive.

An especially important case of this theorem arises when g is the function
which sums the given random variables. Thus, if g(X, Y) = X + Y,

$$E(g(X, Y)) = E(X + Y) = \sum_{x_k} \sum_{y_j} (x_k + y_j)\,P(X = x_k, Y = y_j)$$

We break this into two separate sums

$$\sum_{x_k} \sum_{y_j} x_k\,P(X = x_k, Y = y_j) + \sum_{x_k} \sum_{y_j} y_j\,P(X = x_k, Y = y_j)$$

and sum the first of these over j first and the second over k first. Recalling
that

$$\sum_{y_j} P(X = x_k, Y = y_j) = P(X = x_k)$$

and

$$\sum_{x_k} P(X = x_k, Y = y_j) = P(Y = y_j)$$

we have

$$E(X + Y) = \sum_{x_k} x_k\,P(X = x_k) + \sum_{y_j} y_j\,P(Y = y_j) = E(X) + E(Y)$$

We have shown that the expected value of the sum of two random variables
is the sum of the respective expected values.

This result can easily be extended by induction to any finite number of random
variables (see Problem 6). This, together with Theorem 1(i), implies that for
any random variables $X_1, X_2, \ldots, X_n$ and constants $c_1, c_2, \ldots, c_n$,

$$E(c_1X_1 + c_2X_2 + \cdots + c_nX_n) = c_1E(X_1) + c_2E(X_2) + \cdots + c_nE(X_n) \tag{15}$$
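
The additivity of expected values does not require independence. The sketch below checks it on the dependent pair X (heads in three tosses) and Y (tails in the first two tosses) whose joint probability function is given in Table 5; the variable names are illustrative assumptions.

```python
from fractions import Fraction

eighth = Fraction(1, 8)

# Nonzero entries of Table 5: (x, y) -> P(X = x, Y = y)
joint = {
    (0, 2): eighth,
    (1, 1): 2 * eighth,
    (1, 2): eighth,
    (2, 0): eighth,
    (2, 1): 2 * eighth,
    (3, 0): eighth,
}

# Theorem 4 with g(x, y) = x + y, g(x, y) = x, and g(x, y) = y
e_sum = sum((x + y) * p for (x, y), p in joint.items())
e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())

print(e_x, e_y, e_sum)    # 3/2 1 5/2
```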




13. Let $X_1, X_2, \ldots, X_n$ be independent random variables each having the
uniform probability function

$$P(X_i = k) = \frac{1}{N} \qquad k = 1, 2, 3, \ldots, N$$

Find the probability function of $U_n$, the smallest of the $X_i$.

14. Prove that if X and Y are independent binomial random variables with
respective parameters (n, p) and (m, p), then Z = X + Y is a binomial
random variable with parameters n + m and p.

15. Suppose “at bats” form multinomial trials with possible outcomes 
H = hit, W = walk, and O = out, having respective probabilities yo> 

and -fe If Xn, Xw, and Xo denote the respective numbers of out- 
comes H, W, and O in three at bats, what are the probabilities that 
(Xn - a, Xn = b, Xo = c) for all possible choices of a, b, and c 9 

16. In information theory,* the basic information function H is a measure
of the average amount of information or uncertainty defined over the
values of a random variable. If X is a discrete random variable with r
values x_1, x_2, \ldots, x_r and p_i = P(X = x_i), i = 1, 2, \ldots, r, then H is
defined by

    H(X) = -\sum_i p_i \log_2 p_i

Similarly, for two random variables X and Y, the joint information
function is defined as

    H(X, Y) = -\sum_i \sum_j p_{ij} \log_2 p_{ij}

where

    p_{ij} = P(X = x_i, Y = y_j)

(a) Prove that if X and Y are independent, then

    H(X, Y) = H(X) + H(Y)

(b) Prove that if X and Y are dependent, then

    H(X) + H_X(Y) = H(X, Y)

where

    H_X(Y) = -\sum_i \sum_j p_{ij} \log_2 p_i(j)

    p_i(j) = P(Y = y_j | X = x_i)
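A numerical check can make the identity in part (b) of Problem 16 concrete before a proof is attempted. This is only a sketch: the joint table below is hypothetical, and the variable names (`h_x`, `h_xy`, `h_y_given_x`) are illustrative labels for H(X), H(X, Y), and H_X(Y).

```python
from math import log2, isclose

# Hypothetical joint table p_ij = P(X = x_i, Y = y_j); rows index X, columns index Y.
p = [[0.10, 0.30],
     [0.40, 0.20]]

p_x = [sum(row) for row in p]  # marginal probabilities of X

h_x = -sum(pi * log2(pi) for pi in p_x)
h_xy = -sum(pij * log2(pij) for row in p for pij in row)
# Conditional information H_X(Y), using p_i(j) = p_ij / p_i
h_y_given_x = -sum(pij * log2(pij / p_x[i])
                   for i, row in enumerate(p) for pij in row)
print(h_x + h_y_given_x, h_xy)
```

The two printed values should agree up to floating-point rounding. A numerical agreement of this kind illustrates the identity but does not replace the proof the problem asks for.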


19.5 SUMS OF RANDOM VARIABLES


We have already seen that if X is a random variable and g(X) is a function of
X, then the expectation E(g(X)) may be found directly from the probability


*For a brief nontechnical discussion of information theory, see Luce, R. D., "The Theory
of Selective Information and Some of Its Applications," Part I, in Developments in
Mathematical Psychology, Luce, R. D., Ed. (Free Press, Glencoe, Ill., 1960).


596 / DISCRETE RANDOM VARIABLES 



Since it is not known with certainty when each board will fail, the choice
of a maintenance policy can be regarded as a decision under uncertainty.
Consider unit intervals of time as independent trials and suppose that at
trial 0, all circuit boards are newly installed. For the ith board, let the random
variable X_i denote the number of trials up to and including the trial on which
failure occurs. Then, the probability that a new ith circuit board will fail on
the kth trial is

    P(X_i = k) = p q^{k-1}

where p is the probability of failure and q is the probability of survival. In
addition, assume that the random variables X_1, X_2, \ldots, associated with
the different circuit boards are mutually independent.

As each board fails, it is immediately replaced by a new board, which is
in turn replaced when it fails, and so forth. Suppose that a failure occurs at
trial t. This may occur if a single board fails at time t, or if a first
board fails and is replaced by a second board which fails at time t, and so on.
Thus the failure that occurs on the tth trial may be either the first failure,
or the second, or \ldots, or the tth failure.

Let f_t^{(r)} denote the probability that the rth failure occurs on the tth trial.
Then

    f_t^{(r)} = P(X_1 + X_2 + \cdots + X_r = t) = \binom{t-1}{r-1} p^r q^{t-r}

since X_1, \ldots, X_r are independent random variables, each having the
geometric distribution P(X = k) = p q^{k-1}, k = 1, 2, \ldots.

Let u_t be the probability that a failure occurs at the tth trial. Then

    u_t = f_t^{(1)} + f_t^{(2)} + \cdots + f_t^{(t)}
        = \sum_{r=1}^{t} \binom{t-1}{r-1} p^r q^{t-r}
        = p

The expected number of failures or, equivalently, the expected number of
replacements on the tth trial is given by

    Np
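The claim that a failure occurs at any given trial with probability p, whatever the trial number, can be checked by direct summation. The sketch below assumes that the time of the rth failure is the sum of r independent geometric waiting times with P(X = k) = p q^(k-1); the function names and the value of p are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def prob_rth_failure_at(t, r, p):
    """P(X_1 + ... + X_r = t) for independent geometric X_i with P(X = k) = p q^(k-1)."""
    q = 1 - p
    return comb(t - 1, r - 1) * p**r * q**(t - r)

def prob_failure_at(t, p):
    """u_t: probability that some failure (first, second, ..., t-th) occurs at trial t."""
    return sum(prob_rth_failure_at(t, r, p) for r in range(1, t + 1))

p = F(1, 5)  # hypothetical failure probability
values = [prob_failure_at(t, p) for t in range(1, 8)]
print(values)
```

Each entry of `values` reduces to p because the sum over r is p(p + q)^(t-1), and p + q = 1.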

The expectation principle can now be used in evaluating maintenance policies.
Every T periods (trials) the preventive maintenance policy calls for replacement
of all circuit boards at a cost of NC_1. For all intervening trials, boards




Example 1 The random variable N_r, denoting the number of successes
in r Bernoulli trials, may be written as

    N_r = X_1 + X_2 + \cdots + X_r

where X_1 is the number of successes (0 or 1) on the first trial, X_2 is the
number (again 0 or 1) on the second trial, and so forth. For each i = 1,
2, \ldots, r we have

    P(X_i = 1) = p    and    P(X_i = 0) = q = 1 - p

Hence

    E(X_i) = 1 \cdot p + 0 \cdot q = p

and

    E(N_r) = E(X_1) + E(X_2) + \cdots + E(X_r)
           = p + p + \cdots + p = rp ►

Example 2 For the one-element learning model discussed in Example 2,
Section 18.6, let us define the random variables

    E_n = 0 if a correct response occurs on trial n
    E_n = 1 if an error occurs on trial n

Then if E = \sum_n E_n, the number of errors expected during learning is

    E(E) = E(\sum_n E_n) = \sum_n E(E_n)

= D= i(l-l)(l-p)-‘ 

- a - 0/01 

p 

Example 3 The control unit of an automated assembly line contains a
large number N of electronic circuit boards. The engineering staff is faced
with the decision of whether to

(i) replace the circuit boards only as they fail

(ii) remove and replace all boards after T time periods whether they have
failed or not

The costs involved in their decision are

C_1: cost per board of removal and replacement at some specified time
period

C_2: cost per board of replacement when failure occurs during operation



Unfortunately it is not always the case that the variance (see Equation
(10)) of a sum of random variables is the sum of the individual variances.
In fact, by definition,

    Var(X + Y) = E([X + Y - E(X + Y)]^2)
               = E(\{[X - E(X)] + [Y - E(Y)]\}^2)
               = E([X - E(X)]^2) + 2E([X - E(X)][Y - E(Y)])
                 + E([Y - E(Y)]^2)

The first and third terms represent the respective variances of X and Y, but
the middle term spoils the simplicity of our formula. The quantity

    E([X - E(X)][Y - E(Y)])

is called the covariance of X and Y and is written Cov(X, Y). We have
shown that

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)


The covariance may also be written as

    Cov(X, Y) = E((X - \mu_X)(Y - \mu_Y))
              = \sum_j \sum_k (j - \mu_X)(k - \mu_Y) P(X = j, Y = k)

where \mu_X = E(X) and \mu_Y = E(Y). The terms in this sum are positive when
(j - \mu_X) and (k - \mu_Y) are either both positive or both negative. Otherwise
the terms are negative. Thus the covariance can be any real number. It will
be positive when X and Y tend simultaneously to lie on the same
side of their respective means, and negative when the opposite tendency
is predominant.


Example 4 A fair coin is tossed three times. Let X be the
number of heads on the first toss and Y the total number of heads on the
three tosses. Either from their description or from their joint probability
table we might expect these random variables to have positive covariance.

Actual computation yields Cov(X, Y) = 1/4.
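The covariance in Example 4 can be confirmed by enumerating the eight equally likely outcomes. This sketch assumes that Y is read as the total number of heads in the three tosses, and it uses the identity Cov(X, Y) = E(XY) - E(X)E(Y); the helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

outcomes = list(product([0, 1], repeat=3))  # 1 = heads, 0 = tails
prob = F(1, len(outcomes))                  # fair coin: 8 equally likely outcomes

def expect(g):
    """Expected value of g over the eight outcomes."""
    return sum(g(w) * prob for w in outcomes)

x = lambda w: w[0]      # heads on the first toss
y = lambda w: sum(w)    # total heads in three tosses (assumed reading of Y)

cov = expect(lambda w: x(w) * y(w)) - expect(x) * expect(y)
print(cov)
```

Under these assumptions the computation gives 1/4, which equals Var(X), since the second and third tosses are independent of the first.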


Problems 

4. Verify the computation in Example 4 


A more convenient form for computation of the covariance may be
obtained by applying Equation (15). Since

    (X - \mu_X)(Y - \mu_Y) = XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y

we have

    Cov(X, Y) = E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y




that fail are replaced at an expected cost of C_2 Np(T - 1) if it is assumed that
all boards which fail at trial T are replaced at cost C_1 as part of the preventive
maintenance program.

The expected maintenance cost for a given T is therefore

    E(T) = NC_1 + C_2 Np(T - 1)

Since E(T) is monotonically increasing with T, the choice of a maintenance
policy is based on the expected per period cost E(T)/T. The decision criterion
is that preventive maintenance should be undertaken at the end of any
period for which the expected cost of replacing the circuit boards that fail
during the period exceeds the average per period replacement cost up to and
including that period. That is, preventive maintenance should be taken at the
completion of trial T if

    C_2 Np > \frac{NC_1 + C_2 Np(T - 1)}{T}
           = C_2 Np - \frac{N}{T}(C_2 p - C_1)

Hence, C_2 Np > E(T)/T whenever C_2 p > C_1. Note that the difference
C_2 Np - E(T)/T is maximized when T = 1.
In other words, the strategy should be to replace after every trial if the
expected per unit cost of replacement of a board which fails when in use is
equal to or greater than the per unit cost of preventive maintenance. Otherwise
it is more economical to replace only when a failure occurs. ►
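The decision rule can be explored numerically. The following sketch evaluates E(T) = NC_1 + C_2 Np(T - 1) and the per period cost E(T)/T for a few values of T; the cost figures, the failure probability, and the function names are all hypothetical.

```python
def expected_cost(T, N, C1, C2, p):
    """E(T) = N*C1 + C2*N*p*(T - 1): planned replacement plus expected failure replacements."""
    return N * C1 + C2 * N * p * (T - 1)

def per_period_cost(T, N, C1, C2, p):
    """Average expected cost per period when all boards are replaced every T periods."""
    return expected_cost(T, N, C1, C2, p) / T

# Hypothetical figures: 100 boards, C1 = 2, C2 = 30, failure probability 0.1 per period.
N, C1, C2, p = 100, 2.0, 30.0, 0.1
failure_cost_per_period = C2 * N * p  # expected cost of failures in one period
gaps = [failure_cost_per_period - per_period_cost(T, N, C1, C2, p)
        for T in range(1, 6)]
print(gaps)
```

With these figures C_2 p exceeds C_1, so every gap is positive and the largest gap occurs at T = 1, in line with the remark that the difference C_2 Np - E(T)/T is maximized there.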

PROBLEMS

1. Suppose X and Y have the following joint probability table:

                        Y
      X          1      2      3     P(X = x)

      1         0.1    0.3    0.2      0.6
      2         0.2    0.1    0.1      0.4

   P(Y = y)     0.3    0.4    0.3       1

Find E(X), E(Y), E(X + Y), and E(XY) using Theorem 4. (Hint:
Note that E(XY) is not equal to E(X) E(Y).)

2. Suppose X_1, X_2, \ldots, X_n have common mean m. Prove that the average
\bar{X} = (1/n)(X_1 + \cdots + X_n) also has mean m.

3. Let N_r be the number of black balls obtained in r draws without replacement
from an urn containing b black balls and g green balls (r \le b + g).
Show that E(N_r) = rp where p = b/(b + g). (Hint: Proceed as in
Example 1 and note that this is a special Polya urn.)





as "failures" may be viewed as a special kind of Polya urn. If we define
random variables X_1, X_2, \ldots, X_n by

    X_k = 1 if the kth draw results in "success"
    X_k = 0 otherwise

then our previous results (see Problem 12, Section 18.6) imply

    E(X_k) = P(X_k = 1) = \frac{S}{N}

    Var(X_k) = \frac{S(N - S)}{N^2}

Also if k \ne j, then

    E(X_k X_j) = P(X_k = 1, X_j = 1) = P(X_k = 1) P(X_j = 1 | X_k = 1)
               = \frac{S}{N} \cdot \frac{S - 1}{N - 1}

so that

    Cov(X_k, X_j) = E(X_k X_j) - E(X_k) E(X_j)
                  = \frac{S}{N} \cdot \frac{S - 1}{N - 1} - \frac{S^2}{N^2}
                  = -\frac{S(N - S)}{N^2 (N - 1)}

Since Var(X_k) and Cov(X_k, X_j) are the same for all k and j, it follows
from (16) that

    Var(X_1 + X_2 + \cdots + X_n) = n \frac{S(N - S)}{N^2} - n(n - 1) \frac{S(N - S)}{N^2 (N - 1)}
                                  = npq \frac{N - n}{N - 1}

where p = S/N and q = (N - S)/N. (Compare with Problem >6, Section 19.3.)
The factor (N - n)/(N - 1) serves as a correction factor
for the variance when the sample is drawn without replacement. ►
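The variance formula npq(N - n)/(N - 1) can be compared with a direct computation. The sketch below assumes the standard hypergeometric probability function for the number of successes in a sample drawn without replacement (that formula is not stated in this passage); the population figures and function names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def successes_pmf(N, S, n):
    """P(k successes in n draws without replacement), N items of which S are successes."""
    return {k: F(comb(S, k) * comb(N - S, n - k), comb(N, n))
            for k in range(max(0, n - (N - S)), min(n, S) + 1)}

def variance(pmf):
    """Variance computed from a probability function given as {value: probability}."""
    mean = sum(k * pr for k, pr in pmf.items())
    return sum((k - mean) ** 2 * pr for k, pr in pmf.items())

N, S, n = 10, 4, 3  # hypothetical population and sample sizes
p, q = F(S, N), F(N - S, N)
direct = variance(successes_pmf(N, S, n))
formula = n * p * q * F(N - n, N - 1)
print(direct, formula)
```

Both quantities are exact fractions, so the comparison is free of rounding error. Setting n = 1 removes the correction factor and leaves pq, the variance of a single draw.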

Example 6 Suppose that in a sequence of Bernoulli trials the payoff on
trial k is $k for heads and $0 for tails. If T is the total payoff on n
trials and X_k the number of heads on trial k, we have

    T = \sum_{k=1}^{n} k X_k

Hence

    E(T) = E(\sum_{k=1}^{n} k X_k) = \sum_{k=1}^{n} k E(X_k) = p \sum_{k=1}^{n} k = p \frac{n(n + 1)}{2}





But \mu_X = E(X) and \mu_Y = E(Y), so

Cov (X, Y) = E(XY) - E(X) E(Y ) 

In this form it is apparent that the covariance is zero, and thus the variance 
of the sum X + Y is the sum of the individual variances, if and only if the 
expectation E(XY) of the product XY is equal to the product E(X)E( Y) of 
the respective expectations In the very important case of independent 
random variables, this relation holds For if X and Y are independent, then 

P(X = j, Y = k) = P(X = j) P(Y = k) 
and 

    E(XY) = \sum_j \sum_k jk P(X = j, Y = k)
          = \sum_j j P(X = j) \sum_k k P(Y = k)

Summing first on k and then on j yields 

E(XY) = E(X) E(Y) 

In computing the variance of the sum of any number of random variables, 
we find 

    Var(X_1 + X_2 + \cdots + X_n) = \sum_{i=1}^{n} Var(X_i) + 2 \sum_{j<k} Cov(X_j, X_k)    (16)

For example,

    Var(X_1 + X_2 + X_3 + X_4)
        = Var(X_1) + Var(X_2) + Var(X_3) + Var(X_4)
          + 2 Cov(X_1, X_2) + 2 Cov(X_1, X_3) + 2 Cov(X_1, X_4)
          + 2 Cov(X_2, X_3) + 2 Cov(X_2, X_4) + 2 Cov(X_3, X_4)

If the random variables X_1, X_2, \ldots, X_n are independent, then all
covariances are zero. In this case (16) becomes

    Var(X_1 + \cdots + X_n) = Var(X_1) + \cdots + Var(X_n)    (17)

The variance of the sum of independent random variables is the sum of the 
individual variances 
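When the variables are dependent, the covariance terms of Formula (16) cannot be dropped. The sketch below checks the formula on three dependent variables built from two fair coin tosses; the construction and the helper names are hypothetical and chosen only to keep the enumeration small.

```python
from fractions import Fraction as F
from itertools import product, combinations

# Hypothetical dependent variables built from two fair coin tosses a, b:
# X1 = a, X2 = b, X3 = a + b.
outcomes = list(product([0, 1], repeat=2))
prob = F(1, 4)
variables = [lambda w: w[0], lambda w: w[1], lambda w: w[0] + w[1]]

def expect(g):
    return sum(g(w) * prob for w in outcomes)

def cov(f, g):
    """Cov(f, g) = E(fg) - E(f)E(g); cov(f, f) is the variance of f."""
    return expect(lambda w: f(w) * g(w)) - expect(f) * expect(g)

total = lambda w: sum(v(w) for v in variables)
lhs = cov(total, total)  # variance of the sum, computed directly
rhs = (sum(cov(v, v) for v in variables)
       + 2 * sum(cov(f, g) for f, g in combinations(variables, 2)))
print(lhs, rhs)
```

Here the individual variances add up to 1 while the variance of the sum is 2; the difference is supplied entirely by the covariance terms.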

PROBLEMS 

5 Derive Formula (16) 

Example 5 Sampling without replacement from a population of size N
consisting of S elements designated as "successes" and the remaining N - S




7. (a) Let Z = X - Y. Write E(Z) and Var(Z) in terms of the means,
variances, and covariance of X and Y.
(b) Show that Cov(Z, X) = Var(X) - Cov(X, Y).

8. If S_r denotes the proportion of successes in r Bernoulli trials with
probability p of success, show that

    Cov(S_r, 1 - S_r) = -pq/r

9. Prove that the H measure of information (Problem 16, Section 19.4) has
the property

    H(X, Y) \le H(X) + H(Y)

(Hint: Make use of the inequality t - 1 \ge \ln t, t > 0, and let
t = p_i p_j / p_{ij}.)

10. As in Problem 8, Section 18.6, let the probability of an error on trial n be

    q_n = \alpha q_{n-1}    (0 < \alpha < 1)

Let X_1, X_2, \ldots be mutually independent random variables, where X_n
assumes the value 1 when an error occurs on trial n and the value 0 otherwise.

(a) Verify that E(X_n) = \alpha^n q_0.

(b) Compute the mean and variance of T = \sum_n X_n.

11 (a) Let A' be a random variable with n sthe vahiKOalld 1 with 
Let Y be a binomial random variable taking the values 
respective probabilities 1 — P anc * P Show a 


P(*. Y) = [EjX 1 Y - 1) - r ’ 


0 )] 


(b) Prove that -1 \le \rho(X, Y) \le 1

12 Let A and Y be any two discrete random ' an * + , s non 

(a) Show that, for any real number c, g(c) 

(b) Using' the fact that g (c) is a quadratic function o. c fa parabola), 
argue that we must have 

[E(A1-)1 2 < EO"> 

CTbis is the so-called Cauch>-Schvvare mcqu > on j, ,f ) ,s 

(c) Prove that -1 < p(A, 10 <> and that p( ’ 

a linear function of A O O” * 

t3 In a sequence of r multinomial tnals with po«ib . -^urrences of out- 

lv> X.(1 - 1.2, .m) denote variables L. 

com e O, Find Cov (\„ \,) (»«' Dj j n h lh3 , A , “ • > f trnl A 
(where i « 1,2, ,m, A - 1» 2 » ,r) * 

tevuhv m outcome E, and Ai B 0, 01lKr "'* |( _ dtnolin! . thc num 

14 Interpret the negative binomial random c f Bernoulli truU a' 

of failures before the nth success in a seq • 




and

    Var(T) = \sum_{k=1}^{n} Var(k X_k) = \sum_{k=1}^{n} k^2 Var(X_k)

►


The quantity

    \rho(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X) Var(Y)}}

is called the correlation between the random variables X and Y. When X
and Y are independent, \rho(X, Y) = 0 since in this case Cov(X, Y) = 0.
It can be shown (see Problem 12) that

    -1 \le \rho(X, Y) \le 1

and that the extreme values -1 and 1 are assumed only when one of the
random variables is a linear function of the other. When \rho(X, Y) = 0, X
and Y are said to be uncorrelated.


Example 7 Suppose that random variables X and Y have the following 
joint probability table 


r\^ 

1 } 0 

P(Y = y) 

i 

\ l A 

i 

0 

AAA 

1 

P{X « X) 

1! ft A 



It is obvious from inspection that X and Y arc dependent since, for 
cvimplc /*( A “1, } = However, it can be verified that 

£<> \) - J £(1) «= A,and£(A) = § so that 

Cov (A. }) = £(A1) - £(A) E(Y) 

= i-i 3=0 

and the variables are uncorrelated. From this, we conclude that while the
correlation is zero if the variables are independent, the converse is not
necessarily true. ►
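A second illustration of uncorrelated but dependent variables may help. The sketch below does not reproduce the joint table of Example 7; it assumes a different, hypothetical pair in which X takes the values -1, 0, 1 with equal probability and Y = X^2.

```python
from fractions import Fraction as F

# Hypothetical example (not the table of Example 7): X is -1, 0, or 1 with equal
# probability and Y = X**2, so Y is completely determined by X.
joint = {(-1, 1): F(1, 3), (0, 0): F(1, 3), (1, 1): F(1, 3)}

def expect(g):
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X = 0, Y = 0) = P(X = 0) P(Y = 0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
independent_at_00 = joint[(0, 0)] == p_x0 * p_y0
print(cov, independent_at_00)
```

The covariance is zero even though knowing X determines Y exactly, so zero correlation says nothing about dependence of a nonlinear kind.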


PROBLEMS

6. Show by induction that for any positive integer n,

    E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)




of N individuals the reliability of the test is defined as 
(m, - »0 2 


= £(E- > I*. ~ *> 2 ) 


where 


i N 

= N^,‘ 


and 


i N 


(a) Show that, assuming individual scores are independent random 
variables, p may be rewritten as 


P = 


ZL, (m. - 7»r 


: E-i (»'• - “> 2 + (JV _ 1)<r 

Hence show that 0 < p < 1, and that reliability decreases as the 
error variance increases _ A _ n . 

(b) The simplifying (but unrealistic) assumption that Ei J «• - 
often made. Show that in this case, 


P = 


yi?Li (/». ~ w)“ 


(in, - m) + ^ 

22. (a) Let T_1 and T_2 be independent geometric random variables, not
necessarily having the same probability function. Let

    Z = \min\{T_1, T_2\}

be the smaller of T_1 and T_2. Prove that Z itself is a geometric random
variable. Interpret this result in terms of success and failure in Bernoulli
trials. (Hint: Z > k if and only if T_1 > k and T_2 > k.)

(b) Show that the expected value of Z = \min\{T_1, T_2\} for two geometric
random variables is less than or equal to the smaller of the expected values.

(c) Extend the result in (b) to any two random variables. (Hint: We
always have Z \le T_1 and Z \le T_2.)

(d) Consider a project which must be completed in a succession of stages.
A number of individuals may work independently on a particular stage,
that stage being "completed" as soon as any one of the individuals
completes it. According to the PERT (Program Evaluation and Review
Technique) technique, the expected total completion time is the sum of
the smallest of the expected values of individual completion times for
the various stages. In view of (a)-(c) above, however, explain why PERT
should tend to overestimate completion times.


19.6 GENERATING FUNCTIONS

In this section we shall briefly consider ways of using generating functions
to solve problems in probability. To establish terminology, the
power series \sum_{k=0}^{\infty} a_k s^k will be called the generating function of its sequence



the sum of n independent geometric random variables. Then use Formulas
(15) and (17) to find the mean and variance of W_n.

15. Let X and Y be independent, identically distributed random variables.
Define a random variable U = X + Y and a random variable V =
X - Y. Prove that the correlation between U and V is zero.

16. Prove that if X and Y are binary random variables, that is, assume only 
two values, then if p(X, Y) = 0, the variables are independent 

17. Faced with a pressing problem, an organization considers two options 

(a) Assign the problem to a single individual 

(b) Assign the problem to a group of k individuals 

The nature of the problem is such that it consists of three stages which 
must be solved in sequence Assume that individuals work independently 
and that there is probability 3 that any individual to whom the problem 
is assigned will, within a unit time period, achieve a satisfactory solution 
to the problem stage then under consideration Assume further that a 
group of k individuals solves a stage of the problem if and only if at 
least one member arrives at a solution of that stage 
If each consultant costs the organization $50 per unit time, which
option should be chosen to minimize expected cost? (Hint: With k
individuals, let T_{ij} be the time required for the ith individual to solve
the jth stage, let S_j = \min\{T_{1j}, T_{2j}, \ldots, T_{kj}\} be the solution time for
stage j, and let S = S_1 + S_2 + S_3 be the total solution time.)

18. Suppose we modify Problem 17 slightly by assuming the consultants
each cost $50 for each unit of time in which the problem is not solved.
What option should now be chosen to minimize expected cost?

19. A more realistic assumption concerning Problem 17 is that not only does 
each consultant cost $50 per unit time (including the time in which a 
solution is reached) but that, due to the possibility that a competitor 
will obtain the solution first, there is an additional cost for undue delay 
This time, solve the problem under the assumption that if S units of time
are required, then the cost is

    C = $50 [kS + 2^S]

20. Prove that if X_1, X_2, \ldots, X_N are independent random variables, each
having mean \mu and variance \sigma^2, and if \bar{X} = (1/N) \sum_{i=1}^{N} X_i, then

    E( \sum_{i=1}^{N} (X_i - \bar{X})^2 ) = (N - 1)\sigma^2


21. The basic model of psychometric test-score theory* assumes that the
test score X_i for the ith individual may be written

    X_i = m_i + \epsilon_i

where m_i is that individual's true score and \epsilon_i is a random error variable
having expected value E(\epsilon_i) = 0 and Var(\epsilon_i) = \sigma^2. For a population

‘ivtec™ iKi'r ,hcory t ‘T'*- > P . Utoto* 

(McGraw-Hill Book Company, New York, 1954), 2nd ed.





PROBLEMS


1. Show that for a uniform random variable Y with P(Y = k) = 1/N for
k = 1, 2, \ldots, N, the respective generating functions are

    g_Y(s) = \frac{s(1 - s^N)}{N(1 - s)}    and    m_Y(s) = \frac{e^s(1 - e^{Ns})}{N(1 - e^s)}


Table 8 lists generating functions of the most important discrete random
variables. The primary usefulness of such a table lies in the fact that not
only does any random variable have a uniquely defined generating function,
but conversely, the uniqueness of coefficients in a power series means that
any particular generating function corresponds to exactly one probability
function.


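TABLE 8

Type of            Probability function                              Probability                        Moment
random variable                                                      generating function                generating function

Binomial           P(N_r = k) = \binom{r}{k} p^k (1 - p)^{r-k},      (q + ps)^r                         (q + pe^s)^r
                   k = 0, 1, \ldots, r

Poisson^a          P(X = k) = e^{-\lambda} \lambda^k / k!,           e^{-\lambda + \lambda s}           e^{-\lambda + \lambda e^s}
                   k = 0, 1, 2, \ldots

Geometric          P(X = k) = p q^k,                                 p/(1 - qs)                         p/(1 - qe^s)
                   k = 0, 1, 2, \ldots

Negative           P(X = k) = \binom{r+k-1}{k} p^r q^k,              (p/(1 - qs))^r                     (p/(1 - qe^s))^r
binomial           k = 0, 1, 2, \ldots

Uniform            P(X = k) = 1/(n + 1),                             \frac{1}{n+1} \frac{1 - s^{n+1}}{1 - s}    \frac{1}{n+1} \frac{1 - e^{(n+1)s}}{1 - e^s}
                   k = 0, 1, \ldots, n

^a Section 21.1.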


Either the probability generating function or the moment generating function
may be used to find various moments of a random variable. In particular,
the first derivative of the probability generating function, evaluated at s = 1,





of coefficients. Recall (Section 16.2, Problem 22) that if f(s) = \sum_{k=0}^{\infty} a_k s^k
converges in some interval about zero, the coefficients are uniquely determined
by a_k = f^{(k)}(0)/k!. Conversely, a particular sequence (a_0, a_1, \ldots) of
coefficients obviously determines a unique function f(s) = \sum_{k=0}^{\infty} a_k s^k
within the interval of convergence of this series.

Example 1 (a) If a_k = 1 for each k, the generating function is

    g(s) = \sum_{k=0}^{\infty} s^k = \frac{1}{1 - s}

(b) If N_r is the number of successes in r Bernoulli trials, the generating
function of the sequence of probabilities P(N_r = k) is

    g_{N_r}(s) = \sum_{k=0}^{r} P(N_r = k) s^k = \sum_{k=0}^{r} \binom{r}{k} p^k q^{r-k} s^k = (q + ps)^r

(c) The function e^{-\lambda + \lambda s} generates the numbers

    a_k = \frac{e^{-\lambda} \lambda^k}{k!},    k = 0, 1, 2, \ldots ►


If X is a random variable having nonnegative integer values 0, 1, 2, \ldots,
the probability generating function of X is the expected value

    g_X(s) = E(s^X) = \sum_{k=0}^{\infty} P(X = k) s^k    (18)

The closely related moment generating function is defined by

    m_X(s) = E(e^{sX}) = \sum_{k=0}^{\infty} P(X = k) e^{sk}    (19)

Obviously, the moment generating function is obtained from the probability
generating function by replacing s by e^s. That is, m_X(s) = g_X(e^s).

Example 2 (a) The results of Example 1(b) indicate that the probability
generating function for the number of successes in r Bernoulli trials is

    g_{N_r}(s) = E(s^{N_r}) = \sum_{k=0}^{r} \binom{r}{k} p^k q^{r-k} s^k = (q + ps)^r

The corresponding moment generating function is m_{N_r}(s) = (q + pe^s)^r.

(b) The probability generating function for a geometric random variable
X with P(X = k) = p q^k is

    g_X(s) = E(s^X) = \sum_{k=0}^{\infty} p q^k s^k = \frac{p}{1 - qs}

The corresponding moment generating function is m_X(s) = p/(1 - qe^s). ►
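The closed forms for the geometric case can be compared with a truncated series. This is a sketch under stated assumptions: the value of p, the evaluation points, and the truncation at 2000 terms are arbitrary choices, and the function names are hypothetical.

```python
from math import exp, isclose

p = 0.3  # hypothetical success probability
q = 1 - p

def pgf_series(s, terms=2000):
    """Truncated sum of P(X = k) s^k with P(X = k) = p q^k."""
    return sum(p * q**k * s**k for k in range(terms))

def pgf_closed(s):
    """Closed form p / (1 - q s), valid for |q s| < 1."""
    return p / (1 - q * s)

def mgf_closed(s):
    """Closed form p / (1 - q e^s), valid while q e^s < 1."""
    return p / (1 - q * exp(s))

s = 0.8
print(pgf_series(s), pgf_closed(s))
print(mgf_closed(0.1), pgf_closed(exp(0.1)))
```

The second line of output illustrates the substitution rule: the moment generating function at s equals the probability generating function at e^s. Both closed forms hold only inside the interval of convergence.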





Hence,

    E(N_r) = rp

and

    Var(N_r) = r(r - 1)p^2 + rp - r^2 p^2
             = rp - rp^2 = rp(1 - p)    ►



PROBLEMS

2. Obtain the above results using the moment generating function.

If X and Y are independent random variables, the probability generating
function of the sum Z = X + Y is

    g_Z(s) = E(s^Z) = E(s^{X+Y}) = E(s^X s^Y)

The independence of X and Y implies independence of s^X and s^Y, and we find

    g_Z(s) = E(s^X) E(s^Y) = g_X(s) g_Y(s)

The probability generating function of the sum of two independent random
variables is the product of the individual probability generating functions.
Obviously, the same considerations apply as well to moment generating
functions and also to sums of more than two random variables. That is,
if X_1, X_2, \ldots, X_n are independent random variables, then the probability
(moment) generating function of the sum

    Z = X_1 + X_2 + \cdots + X_n

is the product of the probability (moment) generating functions of the
individual random variables X_1, \ldots, X_n.

Example 4 Let X_i denote the number of successes (0 or 1) on the ith trial
of a Bernoulli sequence. The X_i are independent and have the common
probability generating function

    g_{X_i}(s) = q + ps

Hence the generating function of N_r = X_1 + X_2 + \cdots + X_r is

    g_{N_r}(s) = g_{X_1}(s) \times g_{X_2}(s) \times \cdots \times g_{X_r}(s) = (q + ps)^r

confirming the result of Example 2. ►
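Multiplying probability generating functions amounts to multiplying polynomials in s, and the coefficients of the product are the probabilities of the sum. The sketch below builds (q + ps)^r one factor at a time and compares the coefficients with the binomial probabilities; the helper `multiply` and the parameter values are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def multiply(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of s)."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p = F(1, 3)            # hypothetical success probability
q = 1 - p
single_trial = [q, p]  # coefficients of q + p s

r = 4
pgf = [F(1)]
for _ in range(r):     # (q + p s)^r as a product of r identical factors
    pgf = multiply(pgf, single_trial)

binomial = [comb(r, k) * p**k * q**(r - k) for k in range(r + 1)]
print(pgf == binomial)
```

The same routine would combine the probability functions of any two independent nonnegative integer-valued variables, since polynomial multiplication is exactly the convolution of their coefficient sequences.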

Example 5 The number T_n of failures before the nth success in a sequence
of Bernoulli trials can be written

    T_n = X_1 + X_2 + \cdots + X_n

where X_1, X_2, \ldots, X_n are independent random variables representing,
respectively, the numbers of failures before the first success, between the





gives the expected value. For,

    g'_X(s) = \frac{d}{ds} \sum_{k=0}^{\infty} P(X = k) s^k
            = \sum_{k=0}^{\infty} k P(X = k) s^{k-1}    [inside the interval of convergence]

If the interval of convergence includes the number 1, then

    g'_X(1) = \sum_{k=0}^{\infty} k P(X = k) = E(X)

Similarly,

    m'_X(s) = \frac{d}{ds} \sum_{k=0}^{\infty} P(X = k) e^{sk}
            = \sum_{k=0}^{\infty} P(X = k) k e^{sk}

so

    m'_X(0) = E(X)

The first derivative of the moment generating function, evaluated at zero, is
E(X).

Further differentiation leads to the following general result.

Theorem 5 The nth derivative of the moment generating function m_X(s),
evaluated at s = 0, gives the nth moment of the random variable X. That is,

    E(X^n) = m_X^{(n)}(0)

The nth derivative of the probability generating function g_X(s), evaluated
at s = 1, gives the nth factorial moment of X. That is,

    E(X(X - 1) \cdots (X - n + 1)) = g_X^{(n)}(1) ►

As a particular application of this theorem, we have

    Var(X) = E(X^2) - [E(X)]^2 = m''_X(0) - [m'_X(0)]^2

and

    Var(X) = E[X(X - 1)] + E(X) - [E(X)]^2 = g''_X(1) + g'_X(1) - [g'_X(1)]^2


Example 3 For the binomial random variable N_r, g_{N_r}(s) = (q + ps)^r.
Differentiating, we find

    g'_{N_r}(1) = rp(q + ps)^{r-1} |_{s=1} = rp

    g''_{N_r}(1) = r(r - 1)p^2 (q + ps)^{r-2} |_{s=1} = r(r - 1)p^2





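    U(s) = \frac{1}{2} \left[ \frac{1}{1 - s} + \frac{1}{1 - (q - p)s} \right]

Now U(s) = \sum_{r=0}^{\infty} u_r s^r and

    \frac{1}{2} \left[ \frac{1}{1 - s} + \frac{1}{1 - (q - p)s} \right]
        = \frac{1}{2} \sum_{r=0}^{\infty} s^r + \frac{1}{2} \sum_{r=0}^{\infty} (q - p)^r s^r

We simply equate coefficients of s^r to obtain

    u_r = \frac{1}{2} [1 + (q - p)^r]

►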
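The closed form for the probability of an even number of successes can be checked against the recurrence u_r = p + (q - p)u_{r-1}, u_0 = 1, and against a direct sum of binomial probabilities over even k. The sketch below does this with exact fractions; the value of p and the function names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

p = F(1, 3)  # hypothetical success probability
q = 1 - p

def u_recursive(r):
    """Iterate u_r = p + (q - p) u_{r-1} starting from u_0 = 1."""
    u = F(1)  # zero successes counts as even
    for _ in range(r):
        u = p + (q - p) * u
    return u

def u_closed(r):
    """Closed form (1 + (q - p)^r) / 2."""
    return (1 + (q - p) ** r) / 2

def u_direct(r):
    """Sum of binomial probabilities over even numbers of successes."""
    return sum(comb(r, k) * p**k * q**(r - k) for k in range(0, r + 1, 2))

print([u_closed(r) for r in range(5)])
```

Since |q - p| < 1 whenever 0 < p < 1, the closed form also shows that u_r approaches 1/2 as r grows.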


PROBLEMS

3. Use the probability generating function to compute expected value and
variance for the random variables listed in Table 8.

4. Verify your results in Problem 3 using the moment generating function.

5 A test consists of « items Each ‘tem has ^alterni ^ score T a 

sst 

*1,0*1 II ,b, ..bieil -d-* ■*•'* “ 
show that the generating function of T is 


^ (s > = 


Find E(T) and Var (T) 

6. For discrete random variables X and Y, define the conditional generating
function of Y, given X = k, by

    g_{Y|X}(s | k) = E(s^Y | X = k) = \sum_j s^j P(Y = j | X = k)

Show how the conditional moments E(Y^n | X = k) can be obtained
from this generating function.

7 Suppose that we have a population that is , he onset of 

is, decreasing in size) as, for examp e, people from a dinger 

shock, dissipation of a f rowd ' ^'tch a popubnon of mi.nl «e AT 
area Let us regard each member o ^ -, 0 x, 0 f d>ing (leasing) in the 
as a Bernoulli trial with probability some ihing happens so that 

interval (0, to) Now suppose that at time n, to , _ 

the probability of dying in the interva » ^ that N(0 possesses 

If N(r) is the number still alive at time /(( > M ,-..t», I 

a binomial distribution with parame e 

Wml Use the condiuonal generating function, 

, 1 \ yl, fine the joint, or piioriair, 

8 For discrete random varnblcs X an 

generating function by . , n 

»„ «, fund ion .1 1 *> 

/ «= 1 in g\ i . , j , s 0 bt lined h> setting 

(b) Shovs that the gunurating function o 




first and second successes, \ldots, between the (n - 1)st and nth.
Since (Example 2) the generating function of each X_i is g_{X_i}(s) = p/(1 - qs),
the generating function of T_n is

    g_{T_n}(s) = \left( \frac{p}{1 - qs} \right)^n

verifying line 4 of Table 8.

Generating functions are often useful in solving recurrence relations or 
difference equations The following is a typical example in probability 


Example 6 What is the probability that the number N_r of successes in
r Bernoulli trials is even, zero being considered an even number?

The solution is as follows. The event (N_r even) can occur in two mutually
exclusive ways:

(i) N_{r-1} even and failure on trial r
(ii) N_{r-1} odd and success on trial r

Hence, denoting P(N_r even) by u_r, we have, when r \ge 1,

    u_r = P(N_r even)
        = P(N_{r-1} even and F_r) + P(N_{r-1} odd and S_r)
        = P(N_{r-1} even) P(F_r) + P(N_{r-1} odd) P(S_r)    [independence of trials]
        = u_{r-1} q + (1 - u_{r-1}) p
        = p + (q - p) u_{r-1}

Of course u_0 = 1.

Introducing the generating function U(s) = \sum_{r=0}^{\infty} u_r s^r we find

    U(s) = \sum_{r=0}^{\infty} u_r s^r = u_0 + \sum_{r=1}^{\infty} u_r s^r
         = 1 + \sum_{r=1}^{\infty} [p + (q - p) u_{r-1}] s^r
         = 1 + p \sum_{r=1}^{\infty} s^r + (q - p)s \sum_{r=1}^{\infty} u_{r-1} s^{r-1}
         = 1 + \frac{ps}{1 - s} + (q - p)s U(s)

Solving for U(s) yields

    U(s) = \frac{1 - qs}{(1 - s)[1 - (q - p)s]}

or, as a partial fraction expansion,




MARKOV CHAINS


20


20.1 BASIC CONCEPTS

Probability models involving independent trials are not always applicable
to behavioral processes. In this chapter we consider a model involving
dependent trials which has found wide application. The assumptions are
simple. Each trial is assumed to depend on the immediately preceding trial
but, when the results of that trial are known, not on any more remote trial.
A typical situation is the following case.

Example 1 The Random Walk. A particle moves along a line
(see Figure 1). The particle starts at the origin and a coin is tossed. If it
falls heads (probability p), the particle moves one unit to the right. If it falls tails
(probability q = 1 - p), the particle moves one unit to the left. This constitutes the
first trial and further trials proceed in a similar fashion. At whatever point
the particle may have arrived after the first t trials, the same coin is tossed
and the particle is moved either one unit to the right or to the left on the
(t + 1)st trial. Obviously its position after t + 1 trials is completely
determined by the outcome of trial t + 1 and its position after t trials.
Information about preceding trials is superfluous.


[FIGURE 1: number line showing the integer positions ..., -2, -1, 0, ... of the particle]



(c) Show that E(X) may be obtained by taking the partial derivative of
g_{X,Y} with respect to s and setting s = t = 1.
(6) Stow tow tto moments EOT, E(*n, and 7) may be obtained 
from gx y 

9. Let X be a random variable with generating function k(j) Find the 
generating functions of X — 1 and of 3X 

10. Let X\ and Xi be independent binomial random variables with 

„, = « = (";)/,■'-* 

Use generating functions to prove that X\ + X% has the binomial prob 
ability function 

11. Given the probability generating function 

gx(s) = 1 - (1 — s ) 
compute E{X) Interpret your results 

12. Let V_r denote the probability that the number of successes in r Bernoulli
trials is divisible by three (zero is divisible by 3).

(a) Find V_0, V_1, V_2, and V_3.

(b) Give a probabilistic argument to establish the validity of the formula 


W+£(*),.V-V,_. r > 3 

(fi\ lcaccidu.ce. dee. {wwcVic>ti ~ Y,”-o V r X v?ti> uvi the 

results of (a) and (b) to show that 


V{s) ; 


(1 - 


(1 - qsV ~ (pi) 3 

13. Expand V(s) in Problem 12 in a partial fraction expansion and argue that
\lim_{r \to \infty} V_r = 1/3 regardless of the value of p. Interpret this result.


SUPPLEMENTARY READING 


Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley
& Sons, Inc., New York, 1968), Vol. 1, 3rd ed., Chapters 6, 9, and 11.

Goldberg, S., Probability, An Introduction (Prentice-Hall, Englewood Cliffs, N.J.,
1960), Chapters 4 and 5.

Parzen, E., Modern Probability Theory and Its Applications (John Wiley & Sons,
Inc., New York, 1960), Chapters 7 and 8.





detergents in one week, 67% purchased only detergents during the following 
week — an indication of detergent loyalty. Similar evidence of brand loyalty 
for the other purchase behavior patterns may be noted, leading to the
conclusion that purchases from week to week are not independent. ►


TABLE 1. Transition matrix for laundry purchases

                                   Purchased the following week

Purchased             Detergent    Soap powder    Both powders    No powders
one week

Detergent               0.67          0.09            0.02           0.22
Soap powder             0.04          0.72            0.04           0.20
Both powders            0.12          0.24            0.52           0.12
No powder               0.15          0.26            0.02           0.57


In a Markov chain, we assume there is a collection of states S_1, S_2, \ldots.
The event that the system is in state k at trial n is denoted by S_k(n)
and its probability P(S_k(n)) by \pi_k(n). The vector

    \pi(n) = (\pi_1(n), \pi_2(n), \ldots)

contains the probabilities of being in the various states at trial n.
Since at any trial we must be in some state and cannot be in more than one,
the probabilities in the vector \pi(n) must sum to unity, that is,

    \sum_k \pi_k(n) = 1

The probability of transition from state S_j to state S_k (that is, the
conditional probability of being in state S_k on one trial, given that the process
was in state S_j on the preceding trial) is denoted by p_{jk}. The transition
matrix is

                          To state
    From state    S_1      S_2      S_3     ...

       S_1        p_{11}   p_{12}   p_{13}
       S_2        p_{21}   p_{22}   p_{23}
       S_3        p_{31}   p_{32}   p_{33}
       ...

Since in moving from a state we must go to some state, it follows that the
elements in each row of the transition matrix P sum to unity:

    \sum_k p_{jk} = 1    for each j
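The row-sum property can be checked on the laundry-purchase figures of Table 1. The sketch below also forms two-week transition probabilities by summing over the intermediate week; that step assumes the Markov property and is offered as an illustration rather than as a result stated in this passage. The variable names are hypothetical.

```python
labels = ["Detergent", "Soap powder", "Both powders", "No powder"]
P = [
    [0.67, 0.09, 0.02, 0.22],
    [0.04, 0.72, 0.04, 0.20],
    [0.12, 0.24, 0.52, 0.12],
    [0.15, 0.26, 0.02, 0.57],
]

row_sums = [sum(row) for row in P]

# Two-week transition probabilities: entry (j, k) is the sum over i of p_ji * p_ik.
P2 = [[sum(P[j][i] * P[i][k] for i in range(4)) for k in range(4)]
      for j in range(4)]

for label, total in zip(labels, row_sums):
    print(f"{label:13s} row sum = {total:.2f}")
print(f"P(detergent again two weeks later) = {P2[0][0]:.4f}")
```

Each row of the two-week matrix also sums to one, as it must if the entries are probabilities of ending in some state.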



Example 1 contains the essential ingredients of a Markov* chain which,
generally speaking, constitutes the most important and widely used
probability model for the behavioral sciences. A Markov chain is described by
listing the states of the chain, the initial probabilities of being in the various
states, and the probabilities of transition from one state to another. In Example
1, the states are the possible positions 0, \pm 1, \pm 2, \ldots of the particle. The
process begins in state 0. Denoting the initial probability of being in state
k by \pi_k(0), we have

    \pi_0(0) = 1    and    \pi_k(0) = 0 if k \ne 0

From a given state k, it is possible to move in one step (or trial) only to state
k + 1 or to state k - 1, with respective probabilities p and q. All other
(one-step) transition probabilities are zero.


Example 2 In a T-maze, a rat may turn left and obtain food or turn
right and receive a mild electric shock. Let us suppose that initially (on trial
number zero) he is equally likely to turn left or right. Having obtained food
on one trial, his probabilities of turning left and right become 0.7 and 0.3,
respectively, on the following trial. Having received the shock on a given
trial, his new probabilities of going left or right become 0.8 and 0.2.
We thus have a Markov chain with two states, $R$ (the rat turns right) and
$L$ (the rat turns left). The initial probabilities of $R$ and $L$ are, respectively,
$\pi_R(0) = \tfrac12$ and $\pi_L(0) = \tfrac12$. The probability of transition from state $R$ to
state $R$ (that is, the probability of turning right after having turned right on
the preceding trial) is $p_{RR} = 0.2$, the probability of transition from state
$R$ to state $L$ is $p_{RL} = 0.8$, and so on.

This information is conveniently summarized by writing the initial
probability vector

$$\pi(0) = (\pi_R(0), \pi_L(0)) = (\tfrac12, \tfrac12)$$

and the transition matrix

$$P = \begin{array}{c|cc} & R & L \\ \hline R & 0.2 & 0.8 \\ L & 0.3 & 0.7 \end{array}$$



Example 3 Purchases of laundry cleaning powders over a 26-week period
were studied by Styan and Smith.† Assuming that the types of laundry
powders purchased constituted states of a Markov chain, they obtained the
transition matrix shown in Table 1.

The transition matrix indicates the extent of brand loyalty and brand
switching taking place. For example, of those households purchasing

*After A. A. Markov, one of the many outstanding Russian contributors to the theory of
probability.

†Styan, G. P., and Smith, H., "Markov Chains Applied to Marketing," Journal of Marketing
Research 1, 50-53 (1964).


616 / MARKOV CHAINS 



FIGURE 2 



A state $S_k$ may be reached at trial $n$ by going through one and only one of
the states $S_1, S_2, \ldots$ on trial $n - 1$. This observation allows us to partition
the event $S_k(n)$ into

$$S_k(n) = \bigcup_j \bigl(S_j(n-1) \cap S_k(n)\bigr)$$

so that

$$\begin{aligned}
\pi_k(n) = P(S_k(n)) &= \sum_j P\bigl(S_j(n-1) \cap S_k(n)\bigr) \\
&= \sum_j P\bigl(S_j(n-1)\bigr)\, P\bigl(S_k(n) \mid S_j(n-1)\bigr) \\
&= \sum_j \pi_j(n-1)\, p_{jk}.
\end{aligned}$$

The term $\pi_k(n)$ is the $k$th element in the vector $\pi(n)$, while $\sum_j \pi_j(n-1)\,p_{jk}$
is the $k$th element in the matrix product $\pi(n-1)P$ of the vector $\pi(n-1)$
times the transition matrix $P$. That is, for $n = 1, 2, 3, \ldots$,

$$\pi(n) = \pi(n-1)\,P. \tag{1}$$

It follows that probabilities of being in the various states at any trial can
be written in terms of the initial vector $\pi(0)$ and the transition matrix $P$.
Using (1), we find

$$\begin{aligned}
\pi(1) &= \pi(0)P \\
\pi(2) &= \pi(1)P = \pi(0)P^2 \\
\pi(3) &= \pi(2)P = \pi(0)P^3
\end{aligned}$$

and, by a simple inductive argument,

$$\pi(n) = \pi(0)\,P^n \tag{2}$$

holds for $n = 1, 2, 3, \ldots$. We will always write $P^0 = I$, the identity matrix.
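The recursion $\pi(n) = \pi(n-1)P$ and its consequence $\pi(n) = \pi(0)P^n$ can be checked numerically. The following is a minimal sketch, not part of the original text: it uses the T-maze probabilities stated in Example 2 (0.2 and 0.8 after a right turn, 0.3 and 0.7 after a left turn), and the function names `step` and `distribution` are illustrative choices.

```python
from fractions import Fraction as F

def step(pi, P):
    """Return the row vector pi * P, one application of the one-step recursion."""
    return [sum(pi[j] * P[j][k] for j in range(len(pi))) for k in range(len(P[0]))]

def distribution(pi0, P, n):
    """Return pi(n) = pi(0) P^n by applying the one-step recursion n times."""
    pi = list(pi0)
    for _ in range(n):
        pi = step(pi, P)
    return pi

# T-maze chain of Example 2, states ordered (R, L).
P_maze = [[F(2, 10), F(8, 10)],
          [F(3, 10), F(7, 10)]]
pi0_maze = [F(1, 2), F(1, 2)]

# pi(1) = (1/4, 3/4): one quarter right, three quarters left.
pi1 = distribution(pi0_maze, P_maze, 1)
```

Exact fractions are used so that each vector can be seen to sum to exactly 1 at every trial.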



A matrix with non-negative entries whose rows sum to unity is called a
stochastic matrix. A vector whose entries are non-negative and sum to unity
is a probability vector.

The following is the distinguishing feature of a Markov chain.

Markov Property Suppose the system is in states $S_{j_0}, S_{j_1}, S_{j_2}, \ldots, S_{j_{n-2}}, S_j$,
respectively, on trials $0, 1, 2, \ldots, n - 1$. Then the conditional probability
of state $S_k$ on trial $n$ is the same as the conditional probability of $S_k$, given
only that the process is in state $S_j$ on trial $n - 1$. Symbolically,

$$P\bigl(S_k(n) \mid S_{j_0}(0) \cap S_{j_1}(1) \cap \cdots \cap S_{j_{n-2}}(n-2) \cap S_j(n-1)\bigr)
= P\bigl(S_k(n) \mid S_j(n-1)\bigr) = p_{jk}$$ ►

Roughly speaking, if the state of the process is known at trial $n - 1$,
knowledge of previous trials adds no probabilistic information. All that
matters is which state the process is in and not how it got there. This feature
is sometimes called the independence of path assumption.
It must not be supposed, however, that two events determined by different
trials are necessarily independent if the trial numbers differ by 2 or more.
In the absence of information about intervening trials, the probability of
occurrence of state $k$ on trial $n$ does depend on the occurrence of state $j$ on
trial $m$ even if $n - m > 1$. The next example illustrates this point.


Example 4 In a study of the Samoan language, Newman* has found that
the sequence of vowels $V$ and consonants $C$ is adequately described as a
Markov chain with

$$p_{VV} = 0.51 \qquad p_{VC} = 0.49$$

$$p_{CV} = 1 \qquad p_{CC} = 0$$

Thus consonants never follow consonants in written Samoan, while the
probabilities are approximately equal that the letter following a vowel is a
vowel or a consonant.

For purposes of illustration let us arbitrarily assume that for the first
letter the respective probabilities of $V$ and $C$ are $\tfrac23$ and $\tfrac13$. Then it is easily
verified from Figure 2 that the probability is 0.67 that the third letter is a
vowel, while the conditional probability of this event, given that the first
letter is a vowel, is 0.75. Thus, the events $V_1$ and $V_3$ are not independent.
Note that we have assumed nothing about the second letter. ►
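The two probabilities quoted in Example 4 can be reproduced from the two-step transition probabilities. The sketch below is an illustration only: the first-letter probabilities are taken to be 2/3 and 1/3, the values that reproduce the quoted figure of 0.67, and the helper name `two_step` is hypothetical.

```python
from fractions import Fraction as F

# Samoan vowel/consonant chain of Example 4, states ordered (V, C).
P = [[F(51, 100), F(49, 100)],
     [F(1), F(0)]]

def two_step(P):
    """Return the matrix of two-step transition probabilities, P * P."""
    n = len(P)
    return [[sum(P[i][j] * P[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

P2 = two_step(P)

# Assumed probabilities of V and C for the first letter.
pi1 = [F(2, 3), F(1, 3)]

# Unconditional probability that the third letter is a vowel.
p_v3 = pi1[0] * P2[0][0] + pi1[1] * P2[1][0]

# Conditional probability that the third letter is a vowel, given a vowel first.
p_v3_given_v1 = P2[0][0]
```

Since `p_v3` is about 0.67 and `p_v3_given_v1` is about 0.75, knowing the first letter changes the probability for the third letter, even though nothing is assumed about the second.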


PROBLEMS

1. Verify the calculations in Example 4.

*Newman, E. B., "The Pattern of Vowels and Consonants in Various Languages,"
American Journal of Psychology 64, 369-379 (1951). Markovian properties of a number of
other languages, including English, are discussed.




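is a matrix with identical rows. Thus, for large $n$, the probability that the
$n$th letter is a vowel is approximately $\tfrac23$, regardless of whether the initial
letter is a vowel or a consonant.

Let $\pi_V(n)$ and $\pi_C(n)$ denote the respective probabilities that the $n$th letter
is a vowel or a consonant. Then if $\pi_V(1) = p$, Equation (2) shows that

$$\begin{aligned}
\pi(n) = (\pi_V(n), \pi_C(n)) &= (p, 1 - p)\,P^{n-1} \\
&= \Bigl[\tfrac23 + \tfrac{1-p}{3}\bigl(-\tfrac12\bigr)^{n-2} + \tfrac{p}{3}\bigl(-\tfrac12\bigr)^{n-1},\ \ \tfrac13 - \tfrac{1-p}{3}\bigl(-\tfrac12\bigr)^{n-2} - \tfrac{p}{3}\bigl(-\tfrac12\bigr)^{n-1}\Bigr].
\end{aligned}$$

Hence no matter what the value of $p$, the effect of the initial vector wears off
and we have

$$\lim_{n \to \infty} \pi(n) = (\tfrac23, \tfrac13).$$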


PROBLEMS

4. Verify the calculations of $P^2$ and $P^3$ in Example 5.

5. Go through the inductive proofs leading to Equations (2) and (3).

6. A rat is put into the maze shown in Figure 3. He moves through
the compartments at random. That is, if there are $k$ ways to leave a
compartment, he has probability $1/k$ of choosing each of them. Write
the transition matrix for this Markov chain using the compartment
numbers as states of the chain.

7. Consider a system possessing an equilibrium state which may be
displaced from equilibrium in either a positive or a negative direction.
Examples are inflation-recession or expansion-contraction in economics


FIGURE 3 






PROBLEMS

2. In Example 2, compute $\pi(2)$, $\pi(3)$, and $\pi(4)$.

3. Show that if the rows of a transition matrix are all equal to the same
vector $R = (r_1, r_2, r_3, \ldots)$, then for $n = 1, 2, 3, \ldots$, $P^n = P$ and
$\pi(n) = R$.

In terms of individual elements, Equation (2) may be written

$$\pi_k(n) = \sum_j \pi_j(0)\, p_{jk}(n)$$

where $p_{jk}(n)$ denotes the element in the $(j, k)$ position of $P^n$. In particular, if
we begin in state $S_r$ so that $\pi_r(0) = 1$, the probability $\pi_k(n)$ of being in state
$S_k$ at time $n$ is

$$\pi_k(n) = 1 \cdot p_{rk}(n).$$

That is, the element $p_{rk}(n)$ in the $(r, k)$ position of $P^n$ may be interpreted as
the probability of transition from state $S_r$ to state $S_k$ in $n$ steps.


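Example 5 Let us approximate the situation in Example 4 by adopting the
transition matrix

$$P = \begin{array}{c|cc} \text{From} \backslash \text{To} & V & C \\ \hline V & \tfrac12 & \tfrac12 \\ C & 1 & 0 \end{array}$$

Then

$$P^2 = \begin{pmatrix} \tfrac34 & \tfrac14 \\ \tfrac12 & \tfrac12 \end{pmatrix}$$

and

$$P^3 = \begin{pmatrix} \tfrac58 & \tfrac38 \\ \tfrac34 & \tfrac14 \end{pmatrix}$$

and, in general, a simple induction shows that

$$P^n = \begin{pmatrix}
\tfrac23 + \tfrac13\bigl(-\tfrac12\bigr)^{n} & \tfrac13 - \tfrac13\bigl(-\tfrac12\bigr)^{n} \\
\tfrac23 + \tfrac13\bigl(-\tfrac12\bigr)^{n-1} & \tfrac13 - \tfrac13\bigl(-\tfrac12\bigr)^{n-1}
\end{pmatrix} \tag{3}$$

From this general form we see that if we begin with a vowel, there is
probability $\tfrac23 + \tfrac13(-\tfrac12)^n$ that the $(n + 1)$st letter is a vowel and probability
$\tfrac13 - \tfrac13(-\tfrac12)^n$ that the $(n + 1)$st letter is a consonant. Similarly, the elements
in the second row of $P^n$ give the respective probabilities that the $n$th letter
following a consonant is a vowel or a consonant.
More interesting is the fact that

$$A = \lim_{n \to \infty} P^n = \begin{pmatrix} \tfrac23 & \tfrac13 \\ \tfrac23 & \tfrac13 \end{pmatrix}$$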
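The closed form for $P^n$ can be compared with direct matrix multiplication. This is a sketch under an assumption: the one-step matrix is taken to have rows (1/2, 1/2) and (1, 0), which is what the general formula gives at $n = 1$. The helper names `matmul`, `matpow`, and `closed_form` are illustrative.

```python
from fractions import Fraction as F

# Assumed one-step matrix, states ordered (V, C).
P = [[F(1, 2), F(1, 2)],
     [F(1), F(0)]]

def matmul(A, B):
    """Multiply two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def matpow(P, n):
    """Return P^n for n >= 1 by repeated multiplication."""
    result = P
    for _ in range(n - 1):
        result = matmul(result, P)
    return result

def closed_form(n):
    """Return P^n from the general formula, valid for n >= 1."""
    h = F(-1, 2)
    return [[F(2, 3) + F(1, 3) * h**n,       F(1, 3) - F(1, 3) * h**n],
            [F(2, 3) + F(1, 3) * h**(n - 1), F(1, 3) - F(1, 3) * h**(n - 1)]]

agree = all(matpow(P, n) == closed_form(n) for n in range(1, 12))
```

Because the terms in $(-\tfrac12)^n$ shrink geometrically, both rows approach $(\tfrac23, \tfrac13)$, which is the limiting behavior described in the text.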






for the nomination, verify that the transition matrix is

[Transition matrix with rows and columns labeled $\phi$, $A$, $B$, $C$, $AB$, $AC$, $BC$, $ABC$]


11. (a) Show that the Polya urn scheme (Example 4, Section 18.6) is not a
Markov chain if the state of the system is the color of the ball
drawn.

(b) However, if the state of the system lists the number of balls of each
color in the urn, then we do have a Markov chain.

12. Diffusion of a technological innovation may be interpreted as a
two-state Markov chain. At trial $n$, a potential user is classified as being
either familiar $F$ or unfamiliar $U$ with the innovation. If we assume that
the transition $U \to F$ occurs with probability $p$ and $F \to U$ with
probability 0,

(a) write the transition matrix $P$;

(b) write the $n$-step transition matrix $P^n$ directly without matrix
multiplication.

13. If the process in Problem 12 is assumed to start in state $U$, what is the
probability that the system will be in state $F$ on the

(a) first trial? (b) second trial? (c) $n$th trial?

(Compare with Example 2, Section 18.6.)

14. Suppose we have multinomial trials with possible outcomes $O_1, O_2, \ldots$
on each trial. Show that this is a Markov chain for which $P^n = P$ for
all $n \geq 1$.

15. A small consulting firm employs three trainees. In order
to assure exposure to both office and field work while at the same
time avoiding partiality, the company randomly selects one of the three
trainees each week and assigns him either to the office or to the field.
The initial assignment is made randomly. If a trainee is
selected, his assignment is changed from that of the previous week.
If a trainee is not selected, his assignment remains unchanged. Consider
as states of a Markov chain the triplets $(x, y, z)$, where $x$ is the number
of unassigned trainees, $y$ the number assigned to the office,
and $z$ is the number assigned to the field. Find the transition matrix.




and hypoactivity-hyperactivity or elation-depression in psychology.
For simplicity let us assume a system

$$E_{-2} \quad E_{-1} \quad E_0 \quad E_1 \quad E_2$$

with five possible states in which transitions from any state are restricted
to adjacent states. We thus have a finite random walk (compare with
Example 1) with boundary states $E_{-2}$ and $E_2$ and internal states $E_{-1}$,
$E_0$, and $E_1$. Write the transition matrix for the system in each of the
following cases.

(a) From an internal state the system may move either left or right or
remain in that state, with respective probabilities $p$, $q$, and $r$

$$(p + q + r = 1).$$

However, once a boundary state is reached, the system is "absorbed"
and remains there forever.

(b) Same problem as (a) except that when the system reaches a
boundary state, it is "reflected" to the adjacent internal state. For instance,
if the system enters $E_{-2}$, it moves to $E_{-1}$ in the next time period.

(c) From state $E_j$ (where $j = -2, \ldots, 2$) the system may move either to
$E_{j-1}$ or to $E_{j+1}$ with respective probabilities $(2 + j)/4$ and $(2 - j)/4$.

8. After each successive transmission through the various levels of an
organization, an order may be either in an unaltered state $U$ or, due
either to error or administrative reconsideration, in an altered state $A$.
Assume that $p_{AU} = 1/10$ and $p_{UA} = 8/10$.

(a) Write the transition matrix $P$.

(b) Construct a tree diagram for the first five stages.

(c) Compute the probability vectors $\pi(1), \ldots, \pi(5)$.

(d) Compute $P^2, \ldots, P^5$. Note the trend in your results.

(e) Verify that the calculation for $P^5$ agrees with the results obtainable
from the tree.

9. The arrival of customers at a sales station is a Bernoulli process with
probability $p$ that at least one customer arrives during a given minute.
The sales clerk can complete a break only if no customers arrive in the
next five minutes. Define a Markov chain whose states are defined as the
number of consecutive time periods without a customer arrival. Find
the transition matrix, assuming that the process terminates upon the
occurrence of five successive time periods without a customer arrival.

10. Three candidates are contenders for nomination by the party convention.
Suppose that on each ballot candidate $A$ holds his delegates with
probability g, candidate $B$ holds his delegates with probability f, and
candidate $C$ holds his delegates with probability J. If a candidate loses his
delegation on a particular ballot, he withdraws from contention. On
each ballot, all delegation decisions as to whether to hold or switch are
made without knowledge of the decisions of the other delegations. If the
state of the system is defined as the set of candidates still in contention





Example 1 A simple Markov chain was used by Marshall and Goldhamer*
to model the process leading to first admission to a mental hospital. They
assumed a process having the states

$S_0$ = alive, sane
$S_1$ = alive, insane (mild), unhospitalized
$S_2$ = alive, insane (severe), unhospitalized
$S_3$ = insane, hospitalized
$S_4$ = dead

and transition matrix

$$\begin{array}{c|ccccc}
 & S_0 & S_1 & S_2 & S_3 & S_4 \\ \hline
S_0 & p_{00} & p_{01} & p_{02} & 0 & p_{04} \\
S_1 & 0 & p_{11} & 0 & p_{13} & p_{14} \\
S_2 & 0 & 0 & p_{22} & p_{23} & p_{24} \\
S_3 & 0 & 0 & 0 & 1 & 0 \\
S_4 & 0 & 0 & 0 & 0 & 1
\end{array}$$

According to the model there is no return from the insane, hospitalized
state nor from the state $S_4$. Thus, states $S_3$ and $S_4$ form individual absorbing
classes for the first admission process. Each of the states $S_0$, $S_1$, and $S_2$
forms a separate open class.


Example 2 Let us simplify Problem 15 of Section 20.1 by assuming
two sales trainees. All other conditions remain unchanged. The resulting
Markov chain has six states,

$$s_1 = (0,1,1),\quad s_2 = (0,2,0),\quad s_3 = (0,0,2),\quad s_4 = (1,1,0),\quad s_5 = (1,0,1),\quad s_6 = (2,0,0),$$

with transition matrix

[6 × 6 transition matrix with rows and columns ordered $s_1, \ldots, s_6$]

*Marshall, A. W., and Goldhamer, H., "An Application of Markov Processes to the Study
of the Epidemiology of Mental Disease," Journal of the American Statistical Association
50, 99-129 (1955).

20.2 CLASSIFICATION OF THE STATES

It will be recalled from Chapter 3 that a relation $E$ defined on a set $A$ is
called an equivalence relation if it possesses three properties.

(i) The reflexive property: for every $x \in A$, $xEx$.

(ii) The symmetric property: for every pair of elements $x$ and $y$ in $A$,
if $xEy$, then $yEx$.

(iii) The transitive property: for any three elements $x$, $y$, and $z$ in $A$,
if $xEy$ and $yEz$, then $xEz$.

The set $A$ is partitioned by $E$ into "equivalence classes" in the following
sense. For each $x \in A$, let $A_x$ denote the set of all elements $z \in A$ for which
$xEz$. Then for any two elements $s$ and $t$ in $A$, either $A_s = A_t$ or $A_s \cap A_t = \emptyset$.
The equivalence relation important for Markov chains is the relation of
communication. Two states $S_j$ and $S_k$ are said to communicate ($S_j C S_k$) if
there exist non-negative integers $m$ and $n$ such that $p_{jk}(n)$ and $p_{kj}(m)$ are both
positive. Since $p_{jj}(0) = 1$, each state communicates with itself. Different
states communicate when there is positive probability that each can be
reached from the other in a finite number of steps.

PROBLEMS

1. The last two statements in the preceding paragraph imply that
communication is reflexive and symmetric. Complete the proof that
communication is an equivalence relation by establishing transitivity.

The communication class $C_S$ for a state $S$ is the set of all states which
communicate with $S$. No communication class is empty since each state
communicates with itself. And, of course, since communication is an
equivalence relation, the communication classes of two states are equal if those states
communicate and disjoint if they do not.
A communication class $C$ from which no exit is possible is said to be
closed. In this case $p_{jk} = 0$ if $S_j$ is in $C$ and $S_k$ is outside $C$. It follows that if
we delete from the transition matrix for the chain all rows and columns not
corresponding to states in $C$, we still have a stochastic matrix. Restricting
our attention to this matrix allows us to study the closed class independently
of the rest of the chain. A closed communication class which contains only
a single state is called absorbing, and the same description is applied to the
state itself.

A communication class which is not closed is called open. An open class
$O$ may be left, but never returned to. For if return were possible, a state
$S_j$ in $O$ would communicate with a state $S_r$ outside $O$. In this case, $S_j$ and
$S_r$ would be members of the same communication class, an obvious
contradiction. In leaving an open class, the process may pass either to another
open class or to a closed class.
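The classification just described can be carried out mechanically from the pattern of positive entries in a transition matrix. The sketch below is one possible approach, not the text's own procedure: it finds the states reachable from each state, groups states that can reach each other, and calls a class closed when no state outside it is reachable. The four-state matrix and the function names are hypothetical.

```python
def reachable(P, start):
    """Return the set of states reachable from start in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        j = stack.pop()
        for k, p in enumerate(P[j]):
            if p > 0 and k not in seen:
                seen.add(k)
                stack.append(k)
    return seen

def communication_classes(P):
    """Return a list of (sorted class, 'open' or 'closed') pairs."""
    n = len(P)
    reach = [reachable(P, j) for j in range(n)]
    classes, assigned = [], set()
    for j in range(n):
        if j in assigned:
            continue
        # States that j can reach and that can reach j back.
        cls = sorted(k for k in reach[j] if j in reach[k])
        assigned.update(cls)
        # Closed means nothing outside the class is reachable from inside it.
        closed = all(reach[k] <= set(cls) for k in cls)
        classes.append((cls, "closed" if closed else "open"))
    return classes

# Hypothetical chain: {0, 1} closed, {2} open, {3} absorbing.
P_demo = [[0.5, 0.5, 0.0, 0.0],
          [1.0, 0.0, 0.0, 0.0],
          [0.25, 0.0, 0.5, 0.25],
          [0.0, 0.0, 0.0, 1.0]]

classes_demo = communication_classes(P_demo)
```

A closed class containing a single state, such as state 3 here, is an absorbing class in the terminology of this section.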





FIGURE 5 



from states of the first open class to those of the first closed class, and so
forth. The symbol $O$ indicates a zero matrix.

Regardless of the number of open and closed classes, the transition matrix
of a Markov chain can be partitioned in a manner similar to (4). In each
case, we arrange the closed classes first, followed by the open classes in
such a way that from any open class the process may move only to an open
class or a closed class listed above it. Such an arrangement will be called
the canonical form for the transition matrix.


Example 3 The canonical form for the transition matrix corresponding
to Figure 5 is

[Transition matrix in canonical form for the states of Figure 5]





The various communication classes are easily recognized in the flow
diagram of Figure 4. Once the process leaves state $S_6$, it never returns, so
that $S_6$ forms a single open communication class. Since states $S_5$ and $S_4$
communicate with each other but with no other states, these states form a
separate open class. The remaining three states form a single closed
communication class.

Note that the process passes from open class $\{S_6\}$ to open class $\{S_4, S_5\}$
and then eventually to closed class $\{S_1, S_2, S_3\}$ where it is trapped. ►


FIGURE 4 



PROBLEMS

2. For the Markov chain whose flow diagram is shown in Figure 5, indicate
the communication classes, determine for each class whether it is open
or closed, and write the transition matrix.

If we have a Markov chain with, say, two closed classes and two open
classes, it should be apparent from the above discussion that, after an
appropriate relabeling of the states, the transition matrix for this chain may
be written in partitioned form as

$$P = \begin{pmatrix}
P_1 & O & O & O \\
O & P_2 & O & O \\
A & B & T_1 & O \\
C & D & E & T_2
\end{pmatrix} \tag{4}$$

Here $P_1$ and $P_2$ are stochastic matrices which contain probabilities of
transition within the separate closed classes, $T_1$ and $T_2$ contain transition
probabilities for the respective open classes, $A$ gives probabilities of transition




FIGURE 6 



20.3 REGULAR CHAINS

A Markov chain with transition matrix $P$ is called regular if some power of
$P$ contains only positive entries. It is obvious that all states communicate
in a regular chain, so that the chain consists of a single communication class.
Moreover, it is easily seen that if $P^n > 0$, then all higher powers of $P$ also
contain only positive entries. Thus it is possible to move from any state to
any other state in any number of steps greater than or equal to $n$.

Example 1 The chains in Examples 2, 3, and 4 of Section 20.1 are regular.
Those in Examples 1, 2, and 3 of Section 20.2 are not.

PROBLEMS

1. In Section 20.2, which of the chains in Problems 3-7 are regular and
which are not?

Example 2 In Example 5 of Section 20.1,

$$P = \begin{pmatrix} \tfrac12 & \tfrac12 \\ 1 & 0 \end{pmatrix}$$

and it is obvious that the states communicate. Multiplication shows that

$$P^2 = \begin{pmatrix} \tfrac34 & \tfrac14 \\ \tfrac12 & \tfrac12 \end{pmatrix}$$

has all entries positive. Indeed, we saw that for $n \geq 2$, all entries of $P^n$
are positive and that the powers $P^n$ approach the matrix

$$\lim_{n \to \infty} P^n = \begin{pmatrix} \tfrac23 & \tfrac13 \\ \tfrac23 & \tfrac13 \end{pmatrix}$$
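Regularity can be tested by examining successive powers of the zero/positive pattern of $P$. The sketch below is an illustration rather than a method from the text. It relies on a standard result not stated here (Wielandt's bound): if any power of an $r \times r$ matrix is strictly positive, then the power $(r-1)^2 + 1$ already is, so only finitely many powers need checking. The function name `is_regular` is hypothetical.

```python
def is_regular(P):
    """Return True if some power of P has only positive entries."""
    n = len(P)
    # Only the pattern of positive entries matters, so work with booleans.
    pattern = [[P[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    # Check P^1, P^2, ..., up to the power (n - 1)^2 + 1.
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = [[any(power[i][j] and pattern[j][k] for j in range(n))
                  for k in range(n)]
                 for i in range(n)]
    return False

vowel_chain = [[0.5, 0.5], [1.0, 0.0]]      # second power is positive
alternating = [[0.0, 1.0], [1.0, 0.0]]      # powers alternate, never positive
absorbing = [[1.0, 0.0], [0.5, 0.5]]        # first state can never be left
```

Working with the boolean pattern avoids any difficulty with very small floating-point products in high powers.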




PROBLEMS

From the transition matrices indicated in Problems 3-7, draw a flow diagram,
arrange the states in communication classes, and write a new transition
matrix in canonical form. In Problem 7, * denotes a positive element.

5. The transition matrix in Problem 10 of Section 20.1.

6. The transition matrix in Problem 15 of Section 20.1.

7.
$$\begin{array}{c|ccccccccc}
 & S_1 & S_2 & S_3 & S_4 & S_5 & S_6 & S_7 & S_8 & S_9 \\ \hline
S_1 & * & 0 & 0 & 0 & 0 & 0 & * & 0 & 0 \\
S_2 & 0 & * & 0 & * & 0 & 0 & 0 & 0 & 0 \\
S_3 & 0 & 0 & * & 0 & 0 & 0 & 0 & 0 & * \\
S_4 & 0 & * & 0 & * & 0 & 0 & 0 & 0 & 0 \\
S_5 & 0 & 0 & * & 0 & * & 0 & 0 & 0 & 0 \\
S_6 & 0 & * & 0 & 0 & 0 & * & * & 0 & * \\
S_7 & 0 & 0 & * & 0 & 0 & 0 & * & * & 0 \\
S_8 & 0 & 0 & 0 & * & * & * & 0 & 0 & 0 \\
S_9 & 0 & 0 & * & 0 & 0 & 0 & 0 & 0 & *
\end{array}$$

8. Draw a flow diagram for the Markov chain in Problem 9, Section 20.1.

9. (a) Write a transition matrix for a five-state Markov chain with closed
classes $\{S_1, S_3\}$ and $\{S_2, S_4, S_5\}$. Denote nonzero entries by *.

(b) Partition the transition matrix according to communication classes
and tell how to compute powers of this matrix.

(c) Why can the chain be treated as two separate chains?

10. Prove that every Markov chain with a finite number of states must have
at least one closed communication class.

11. For the random walk of Example 1 of Section 20.1, prove that

$$p_{00}(2n + 1) = 0 \quad \text{for } n = 0, 1, 2, \ldots$$

12. Prove that in a finite chain with $r$ states, if $S_k$ can be reached at all from
$S_j$, it can be reached in $r$ or fewer steps.

13. In the flow diagram in Figure 6 an arrow denotes a positive transition
probability.

(a) Argue that every state communicates with every other.

(b) Argue that return to a given state is possible only in 4, 8, 12, ...
steps, that is, in a number of steps which is a multiple of four. The
process is called periodic (with period 4).





Example 3 Roby* is concerned with decisions which affect income over a
long period of time. He assumes:

(i) At each time (trial) $t$, an individual may be in one of several
environmental states $S_1, S_2, \ldots, S_r$, which constitute the states of a Markov
chain.

(ii) A sojourn in state $S_i$ at trial $t$ leads to a specific income $v_i(t)$.

Let $v(t) = \mathrm{col}\,(v_1(t), v_2(t), \ldots, v_r(t))$ be the income vector at time $t$ and
$\pi(t) = (\pi_1(t), \ldots, \pi_r(t))$ be the futurity vector (Roby's terminology) of
probabilities of being in the various states at time $t$. Then the decision
maker's expected income at time $t$ is

$$\pi(t)\,v(t) = \sum_{i=1}^{r} \pi_i(t)\,v_i(t). \tag{6}$$

As an example, consider a stock which goes through states of growth $S_1$,
stability $S_2$, and dividend $S_3$ subject to the transition matrix

[3 × 3 transition matrix with rows and columns labeled $S_1$, $S_2$, $S_3$]

and an income vector $v = \mathrm{col}\,(1, 4, 3)$, constant over time.
According to Theorem 2, the limiting vector

$$\lim_{t \to \infty} \pi(t) = \alpha = (\alpha_1, \alpha_2, \alpha_3) \tag{7}$$

is the only probability vector satisfying the equation $\alpha = \alpha P$. Thus we want
the unique solution of the three equations $\alpha_k = \sum_j \alpha_j p_{jk}$ $(k = 1, 2, 3)$
together with

$$1 = \alpha_1 + \alpha_2 + \alpha_3.$$

Hence, using (6) and (7), the limiting value of the investor's expected income
per unit time is

$$\lim_{t \to \infty} \pi(t)\,v(t) = \alpha v,$$

independent of the initial vector $\pi(0)$. ►

Example 4 Markov chains have been used to study
labor mobility.† However, the models require rather strong assumptions

*Roby, T. B., "Utility and Futurity," Behavioral Science 7 (1962).

†For example, Prais, S. J., "Measuring Social Mobility," Journal of the Royal Statistical
Society, Series A 118, 56-66 (1955).



The behavior exhibited in Example 2 is typical of regular chains. Theorem
1 gives the details. The proof is rather lengthy and will be omitted.*

Theorem 1 If $P$ is a regular transition matrix, then the sequence $(P^n)$ of
powers of $P$ converges to a stochastic matrix $Q$ each of whose entries is
positive and each of whose rows is the same probability vector. ►

The next theorem provides a simple method for computing the limiting
form of $P^n$.

Theorem 2 Let $P$ be a regular $r \times r$ transition matrix and let $Q =
\lim_{n \to \infty} P^n$ be the limiting matrix, each of whose rows is the vector $v =
(v_1, v_2, \ldots, v_r)$. Then,

(a) no matter what the initial vector $\pi(0)$, the sequence $\pi(1), \pi(2), \ldots$
converges to $v$;

(b) $v$ is the unique probability vector satisfying the equation

$$v = vP.$$

PROOF (a) Since $\pi(n) = \pi(0)P^n$, the $k$th element of $\pi(n)$ is

$$\pi_k(n) = \sum_{j=1}^{r} \pi_j(0)\,p_{jk}(n).$$

Since, by the preceding theorem,

$$\lim_{n \to \infty} p_{jk}(n) = v_k \tag{5}$$

holds for each $j$, it follows that

$$\lim_{n \to \infty} \pi_k(n) = \sum_{j=1}^{r} \pi_j(0)\,v_k = v_k.$$

(b) Since $P^{n+1} = P^n \cdot P$, we have

$$p_{ik}(n + 1) = \sum_{j=1}^{r} p_{ij}(n)\,p_{jk}.$$

Taking limits as $n \to \infty$ and using (5) gives

$$v_k = \sum_{j=1}^{r} v_j\,p_{jk},$$

proving that

$$v = vP.$$

If $u$ were another probability vector such that $u = uP$, then a simple
induction shows that $u = uP^n$ for all $n \geq 1$. But, since $u$ is a probability
vector, it follows from (a) that $\lim_{n \to \infty} uP^n = v$. Hence $u = v$. ►
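Part (b) of Theorem 2 turns the search for the limiting vector into a set of linear equations: $v = vP$ together with the requirement that the entries of $v$ sum to 1. The sketch below solves that system exactly by Gauss-Jordan elimination. It is an illustration, not the text's own procedure; the function name `fixed_vector` is hypothetical, and the routine assumes the chain is regular so that the solution is unique (for other chains the pivot search may fail).

```python
from fractions import Fraction as F

def fixed_vector(P):
    """Solve v = vP with sum(v) = 1 by exact Gauss-Jordan elimination."""
    r = len(P)
    # First r - 1 equations: sum_j v_j (p_jk - delta_jk) = 0 for k = 0..r-2.
    A = [[F(P[j][k]) - (1 if j == k else 0) for j in range(r)] + [F(0)]
         for k in range(r - 1)]
    # Last equation: v_1 + ... + v_r = 1.
    A.append([F(1)] * r + [F(1)])
    for col in range(r):
        pivot = next(i for i in range(col, r) if A[i][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for i in range(r):
            if i != col and A[i][col] != 0:
                factor = A[i][col]
                A[i] = [x - factor * y for x, y in zip(A[i], A[col])]
    return [A[i][r] for i in range(r)]

# Two-state chain with rows (1/2, 1/2) and (1, 0).
v_vowel = fixed_vector([[F(1, 2), F(1, 2)], [F(1), F(0)]])

# Two-state chain with rows (0.2, 0.8) and (0.3, 0.7).
v_maze = fixed_vector([[F(2, 10), F(8, 10)], [F(3, 10), F(7, 10)]])
```

One of the $r$ equations in $v = vP$ is redundant because the rows of $P$ sum to 1, which is why the sketch replaces the last of them by the normalization condition.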


*See Kemeny, J. G., and Snell, J. L., Finite Markov Chains (D. Van Nostrand Company,
Princeton, N.J., 1960), pp. 70-71.





PROBLEMS

2. Find the limiting transition matrix for Problem 6 of Section 20.1.

3. Find $\lim_{n \to \infty} P^n$ in Problem 8 of Section 20.1 and compare with $P^5$
computed in that problem.

4. In Example 3 of Section 20.1, find the eventual market share for each
type of laundry powder.

5. Suppose $Q = \lim_{n \to \infty} P^n$, where $P$ is a regular transition matrix. Consider
another Markov chain with transition matrix $R = P^k$, where $k$ is
an integer. Prove that $Q$ is also the limiting transition matrix for this
new chain.

6. Show that the limiting vector for the general two-state chain with
transition matrix

$$P = \begin{pmatrix} 1 - \alpha & \alpha \\ \beta & 1 - \beta \end{pmatrix}$$

is $v = (\beta/(\alpha + \beta),\ \alpha/(\alpha + \beta))$. (See Problem 25, Section 10.3.)

7. Find a transition matrix $P$ for which $v = (\tfrac12, \tfrac13, \tfrac16)$ is the limiting
probability vector.

8. Due to its favorable location in a progressive sh ®P p “^ t “5y m^ease in 
Nichols Department Store has been e*™"^*^* growth 
sales Although the present store facilities are 

may require expansion or possibly constructs « * « share of 
management is attempting to preset t eff 1^ B_ n( fac| , lties cannot 
the market It has been determined tha million Annual sales of 

adequately handle annual sales in exces ^ facilities, and sales 

S15-S25 million would require expansio P ^ A markel research 
m excess of S25 million would require new o j- the cus tomers 

firm retained by the company has determin 3 purchase 

who made their last purchase at Nichols will make com . 

from a competitor, while \ of those cus L ase jf annual depart- 

petitor will switch to Nichols for their nex center are forecast 

ment store sales in millions of dollars at the shopping 
to increase according to the growth function 


40 


= 0 , 1 , 2 , 


for what expansion cl-nmali.y should management ixgin planning’ 
(Assume the market process is a Mar ov ‘ ^ cr|1 , ca | 0 r the 

9. Suppose that in Problem 8 some of that consumer buying 

findings of the market research firmonthcgr ^ ^ rcsu |,, the market 
habits were not considered in sufiicien purchases instead o 

research firm proposed to study three c opo5e d to estimate the 

two as ,n the mitial study For « ar " p ' i c ' purchase from Nichols, 

proportion of customers who will mac conwcu me purchases t ere 

given that they had made their last ensiled .and the firm » 

The previously unconvinced executive 





that often are not tenable: for example, that the probability of changing
industries is constant over time and does not depend on the worker's prior
history or work experience, that the system is closed in that no new workers
enter, or that they enter only to replace those not otherwise accounted for, etc.
In order to obtain a more realistic model, Blumen, Kogan, and McCarthy*
divided the working population into two classes, "stayers" and "movers."
With probability 1, a stayer remains in his respective industry during the
observation period (quarter). For a mover, the probability of transition from
industry $i$ to industry $j$ is $m_{ij}$. (We allow the possibility that $i = j$.)

The only transitions take place among movers, and for this segment of the
population the transition matrix is $M = (m_{ij})$. The element $m_{ij}(n)$ in the
$(i, j)$ position of the $n$th power of $M$ gives the probability that a mover goes
from industry $i$ to industry $j$ in $n$ quarters.
Let us denote by $s_i$ the proportion of stayers in industry $i$, so that $1 - s_i$
is the proportion of movers. Then the probability that a person selected at
random from industry $i$ at time zero will be in industry $j$ at time $n$ is

$$\begin{aligned}
p_{ii}(n) &= s_i + (1 - s_i)\,m_{ii}(n) \\
p_{ij}(n) &= (1 - s_i)\,m_{ij}(n) \quad \text{for } i \neq j
\end{aligned} \tag{8}$$

If we let $S$ be a diagonal matrix with elements $s_i$, Equations (8) imply that
for $n = 1, 2, \ldots$, the "transition" matrix $P(n) = (p_{ij}(n))$, for $n$ steps, is

$$P(n) = S + (I - S)M^n.$$

Note that $P(n)$ is not $[P(1)]^n$, so the process, strictly speaking, is not a
Markov chain. Markov chain techniques are applied only to the movers.
To take a specific case, consider a simple three-industry economy in which

$$S = \begin{pmatrix} 0.9 & 0 & 0 \\ 0 & 0.2 & 0 \\ 0 & 0 & 0.4 \end{pmatrix}
\quad \text{and} \quad
M = \begin{pmatrix} 0.2 & 0.3 & 0.5 \\ 0.1 & 0.6 & 0.3 \\ 0.1 & 0.5 & 0.4 \end{pmatrix}$$

Then

$$\lim_{n \to \infty} P(n) = S + (I - S) \lim_{n \to \infty} M^n$$

where $\lim_{n \to \infty} M^n$ is a matrix each row of which is the probability vector
$v = (\tfrac19, \tfrac{43}{81}, \tfrac{29}{81})$ obtained by solving the equation $v = vM$. Thus

$$\lim_{n \to \infty} P(n) = \begin{pmatrix}
0.911 & 0.053 & 0.036 \\
0.089 & 0.625 & 0.286 \\
0.067 & 0.319 & 0.615
\end{pmatrix}$$

Note that the limiting probabilities are not independent of the initial
industry, a reflection of the non-Markovian nature of the model and,
presumably, of the relative stability of employment in the various industries. ►

*Blumen, I., Kogan, M., and McCarthy, P. J., The Industrial Mobility of Labor as a Probability
Process, Cornell Studies in Industrial and Labor Relations, vol. 6 (Ithaca, N.Y., 1955).





Although Theorem 3 ensures that a Markov chain will eventually pass
into a closed class, to remain there forever, the behavior of the process within
the open classes is itself interesting and important. In many applications of
Markov chains to behavioral processes, the process is assumed to begin in
an open class. Questions are then raised as to the length of time spent within
the open classes and the probabilities of becoming trapped in one or another
of the closed classes.

The simplest types of chains having open classes are those in which each
closed communication class contains only a single state. Such closed classes
are called absorbing and the same term is applied to the state within the class.
For an absorbing state $S_k$ the transition probability $p_{kk}$ must equal 1, since
the process, once it enters $S_k$, can never leave. (Of course, the process can
never exit from any closed class, but the term absorbing is reserved for
single-state classes.) The entire chain is called absorbing if all closed classes are
absorbing.


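Example 1 (a) In an attempt to apply the theory of Markov chains to
migration into and from California, Rogers* obtained the
transition matrix

$$\begin{array}{c|ccc}
 & \text{Calif.} & \text{U.S.} & \text{Death} \\ \hline
\text{Calif.} & 0.9041 & 0.0331 & 0.0628 \\
\text{U.S.} & 0.0068 & 0.9352 & 0.0580 \\
\text{Death} & 0 & 0 & 1
\end{array}$$

The notion of death as an absorbing state needs no further comment.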

(b) Adapting and extending Rogers’ ' l ' ork '®' lo S "assumed that older 
sorbing Markov chain model of social mobi 1 y social class than 

People are more likely to stay in an occupation, regi > ^ process via a 
are younger people, who are assumed to enter e mo rtel consider the 
nonabsorbing state As a simple illustration o 
Allowing transition matrix 



*Ro? 
0 j 

o 

til 


. Center for Pkmninc = nd 

£»* A,"A Markovian Model of InlerrcponalMjEra , , 96 < Cilrf 

^' ! °Pmen, Rcveareh. Ilerkelev, Calif ,, rcl , New V«k 

Prevv New Votk, 


(mimcoprapnev* * j, fcw fscvv 
Ckjp, 'forfeit of Croup Brhauor (Columbia nil 


° I . Slmplr Models of Croup Belts 


20 a CHAINS 


{or (Columt ta Univervip I 

WITH OPCN CtASStS / »» 



retained to do the study Their results are presented in the flow diagram 
in Figure 7 For instance, the transition CN -*■ NC indicates a purchase 
at a competitor’s given that the previous two purchases were at a com- 
petitor’s and Nichols’, respectively Treating the new process as a 
Markov chain with states NN, CN , NC, and CC, find the annual sales 
that can ultimately be expected Does the recommendation differ from 
that made on the basis of the initial study 0 


FIGURE 7 



10 Consider a two state Markov chain with transition matrix 


*1 s 2 



Form a new chain whose states are pairs of states of the old chain using
the rule that the new chain is in state $s_i s_j$ on trial $t$ whenever the original
chain is in states $s_i$ and $s_j$, respectively, on trials $t - 1$ and $t$. This new
chain is called the expansion of the original chain.

(a) Compute the transition matrix for the new chain.

(b) Show that the new chain is regular.

(c) Find the fixed probability vectors for both the original and the new
chains. What is the relationship between these fixed vectors?


20.4 CHAINS WITH OPEN CLASSES

We now know that in a Markov chain consisting of a single communication
class there is a limiting behavior which is independent of the initial
probabilities of being in the various states. The limiting vector $\pi$ is found by
solving the equation $\pi = \pi P$, where $P$ is the transition matrix for the chain.
For chains with both open and closed classes, these results apply only after
the process has entered a closed class. Thus, in order to complete our study of
Markov chains, we must consider questions concerning the behavior of the
process within the open classes. The following is a key result, which we
state without proof.


Theorem 3 In a finite chain, there is probability 1 that the process
eventually enters a closed class.
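Theorem 3 can be illustrated numerically, though of course not proved, with the migration matrix of Example 1(a), in which Death is the only closed class. The sketch below raises $P$ to increasing powers and reports the probability of having entered the closed class; the helper names are our own.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """Return the n-step transition matrix P^n by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

# State order: California, rest of U.S., Death (the closed class).
P = [
    [0.9041, 0.0331, 0.0628],
    [0.0068, 0.9352, 0.0580],
    [0.0,    0.0,    1.0],
]

def prob_in_closed_class(start, n, closed=(2,)):
    """Probability of being in a closed class after n trials, starting in `start`."""
    row = mat_pow(P, n)[start]
    return sum(row[j] for j in closed)

for n in (1, 10, 100, 1000):
    print(n, prob_in_closed_class(0, n), prob_in_closed_class(1, n))
```

The printed probabilities increase toward 1 from either open state, as the theorem asserts.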


634 / MARKOV CHAINS 



In terms of the original matrix $P$ then,

$$\lim_{n \to \infty} P^n = \begin{pmatrix} I & O \\ (I - T)^{-1}A & O \end{pmatrix} \qquad (11)$$
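To see the lower left block of this limit in action, here is a minimal sketch for a chain with two absorbing and two open states. The entries of $T$ and $A$ are hypothetical values chosen only for illustration (they are not those of any example in the text), and exact fractions are used so that the row sums can be checked exactly.

```python
from fractions import Fraction as F

def mat_mul(x, y):
    """Multiply matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def inverse_2x2(m):
    """Invert a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical blocks: T (open -> open) and A (open -> absorbing).
# Each row of (A | T) sums to 1, as a transition matrix requires.
T = [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 2)]]
A = [[F(1, 4), F(0)], [F(0), F(1, 4)]]

I_minus_T = [[(1 if i == j else 0) - T[i][j] for j in range(2)] for i in range(2)]
N = inverse_2x2(I_minus_T)   # (I - T)^(-1)
B = mat_mul(N, A)            # absorption probabilities (I - T)^(-1) A

print(B)
print([sum(row) for row in B])
```

Each row of `B` sums to 1, in agreement with Theorem 3: starting from any open state, the process is absorbed somewhere with probability 1.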


Example 2 Suppose that in Example 1(b) we find a -\,b - i,c 20 , 
d = fa e = fa, and/ = T e o giving the transition matrix 



Then 


v—t) = ( * ).v-Tr'-i 

\“1 <5 To/ 


/40 10 
v20 50, 


lim P n =| 


giving 

t\ 0 0 0\ 

0 10 0 

I 5 0 °, 

vl! 0 °/ 

Note that the elements in each row of $(I - T)^{-1}A$ sum to 1.

An absorbing state $A_k$, once entered, can never be left. Thus the element
appearing in the $(i, k)$ position of the block $(I - T)^{-1}A$ of the limiting
matrix (11) represents the probability that the chain, starting in $S_i$, eventually
becomes "absorbed" in $A_k$. For instance, in Example 2 the entries in the row
for $s_1$ give the probabilities that an individual who begins in occupation $s_1$
eventually makes each occupation his life career. Similar interpretations may
be made for the other elements of $(I - T)^{-1}A$.

The matrix $(I - T)^{-1} = I + T + T^2 + T^3 + \cdots$ itself has an important
interpretation in terms of expected values. Recall that the element $t_{ij}(n)$ in
the $(i, j)$ position of $T^n$ represents the probability that the process is in state
$S_j$ on the $n$th trial, given that it started in state $S_i$ (event $S_i(0)$). If
we define a random variable $X_j(n)$ to be the number of times the process is
in state $S_j$ on trial $n$, it is obvious that $X_j(n)$ can take only two values: 1, if
the process is in $S_j$ on trial $n$, and 0, if the process is not in $S_j$ on trial $n$.
The interpretation of $t_{ij}(n)$ implies that

$$t_{ij}(n) = P(X_j(n) = 1 \mid S_i(0)) \quad \text{and} \quad 1 - t_{ij}(n) = P(X_j(n) = 0 \mid S_i(0))$$





Here states refer to occupations. To say that an individual is in state $s_1$
means that he has occupation $s_1$ but has not decided to make it his life career,
while, in state $S_1$, an individual not only has occupation $s_1$, but has decided
to make it his lifelong career. Similar remarks apply to $s_2$ and $S_2$. From a
practical point of view, of course, considering $S_1$ and $S_2$ as absorbing
requires the somewhat questionable assumption that decisions regarding life
career are irrevocable.


In a manner similar to Example 1(b), the transition matrix for an absorbing
chain having $r$ absorbing states $A_1, A_2, \ldots, A_r$ and $t$ states $S_1, S_2, \ldots, S_t$
in open classes may be written in the form

$$P = \begin{pmatrix} I & O \\ A & T \end{pmatrix} \qquad (9)$$

where $I$ is an $r \times r$ identity matrix listing probabilities of transition among
the absorbing states, $O$ is an $r \times t$ matrix of zeros representing probabilities
of transition from the absorbing states to the other states $S_1, S_2, \ldots, S_t$,
$A$ is a $t \times r$ matrix of probabilities of transition from each of the open
states to the absorbing states, and $T$ is a $t \times t$ matrix whose entries represent
probabilities of transition among the states $S_1, S_2, \ldots, S_t$.

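When $P$ has the form of Equation (9), it follows that

$$P^2 = \begin{pmatrix} I & O \\ A + TA & T^2 \end{pmatrix} \qquad P^3 = \begin{pmatrix} I & O \\ A + TA + T^2A & T^3 \end{pmatrix}$$

and simple induction shows that

$$P^n = \begin{pmatrix} I & O \\ A + TA + \cdots + T^{n-1}A & T^n \end{pmatrix} = \begin{pmatrix} I & O \\ (I + T + \cdots + T^{n-1})A & T^n \end{pmatrix}$$

Theorem 3 implies that the sequence of elements in the $(i, j)$ position
of powers of $T$ must converge to zero for each pair $(i, j)$. We indicate this by
writing $\lim_{n \to \infty} T^n = O$, a $t \times t$ zero matrix.
As for the term $(I + T + \cdots + T^{n-1})$, it is easily seen that

$$(I - T)(I + T + \cdots + T^{n-1}) = I - T^n$$

Since $\lim_{n \to \infty} T^n = O$, we find

$$\lim_{n \to \infty} (I - T)(I + T + \cdots + T^{n-1}) = I - \lim_{n \to \infty} T^n = I$$

so that

$$\lim_{n \to \infty} (I + T + \cdots + T^{n-1}) = \sum_{k=0}^{\infty} T^k = (I - T)^{-1} \qquad (10)$$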
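The two facts used in this argument, the telescoping identity and the convergence of the partial sums, can be checked on a small case. The sketch below uses a hypothetical $2 \times 2$ matrix $T$ (our own choice, with row sums less than 1 so that its powers die out) and exact fractions.

```python
from fractions import Fraction as F

def mat_mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def mat_sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def partial_sum(T, n):
    """Return the pair (I + T + ... + T^(n-1), T^n)."""
    size = len(T)
    total = [[F(0)] * size for _ in range(size)]
    power = identity(size)            # T^0
    for _ in range(n):
        total = mat_add(total, power)
        power = mat_mul(power, T)
    return total, power

# Hypothetical open-state block, chosen only for illustration.
T = [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 2)]]
I2 = identity(2)

S10, T10 = partial_sum(T, 10)
print(mat_mul(mat_sub(I2, T), S10) == mat_sub(I2, T10))   # telescoping identity

S200, T200 = partial_sum(T, 200)
print([[float(x) for x in row] for row in S200])          # approaches (I - T)^(-1)
```

For this $T$ the inverse of $I - T$ has entries $8/3$ and $4/3$, and the partial sums settle on those values as $T^n$ shrinks to the zero matrix.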





a total of 9.918 years, before being absorbed by death or a mental hospital.
The prediction that a person who is severely insane but unhospitalized ($S_2$)
will remain in that state for 5.158 years, on the average, before being absorbed,
while a person who is only mildly insane but unhospitalized ($S_1$) can be
expected to remain in that state for only 3.525 years before being absorbed,
appears unreasonable and tends to detract from the credibility of the
model.

We have not discussed chains with an infinite number of states, although
these occasionally arise. Results for such chains are similar to those given
above, although some additional detail is required due to the fact that the
matrix $(I - T)$ may not have a unique inverse. Details may be found in the
book by Feller listed under Supplementary Reading.

PROBLEMS 

1. Compute (7 - T)~ l for the absorbing chain with transition matrix 

/I ! 0 _0\ 


°lt f) = C r) 
Vi ' § w 


'2 ! § i' 

2. In Problem 10 of Section 20.1, compute $(I - T)^{-1}$, where $T$ is the matrix
of transition probabilities among states in the open classes. Find the expected
number of ballots required and the probability of each possible outcome
of the process.

3 For the balloting process, compute the following • q ““ are 

(a) The expected number of ballots on which all three 

active contenders . n 

(b) The probability that C withdraws before cit cr 

(c) The probability that A withdraws before cit cr 

(d) The probability that D and C withdraw on the same ^ 

4. In Problem 9 of Section 20 1, find the two minulcs 

clerk can take a break, given that the last customer 

ago it 

5. In Example 1(a), what is the probability that a California resident will
leave the state?

6. For the social mobility model of Example 1(b), show that $(I - T)^{-1}A$
is a transition matrix ($T$ and $A$ are defined as in Example 2).

7. Use Equation (10) to show that $(I - T)^{-1}T = (I - T)^{-1} - I$.

In Problem 8 or Section 20 1, find ,hc y before p-num: lo 

eluding time zero) that the process remains 
Mate A 





so that

$$E(X_j(n) \mid S_i(0)) = 1 \cdot t_{ij}(n) + 0 \cdot (1 - t_{ij}(n)) = t_{ij}(n)$$

The total number of trials (including trial zero) on which the process hits
$S_j$ before passing into an absorbing state is

$$X_j = X_j(0) + X_j(1) + X_j(2) + \cdots$$

Hence, when the process starts in $S_i$, the expected number of times it hits
$S_j$ is

$$E(X_j \mid S_i(0)) = \sum_{n=0}^{\infty} E(X_j(n) \mid S_i(0)) = \sum_{n=0}^{\infty} t_{ij}(n)$$

This is the element in the $i, j$ position of $I + T + T^2 + T^3 + \cdots$ or,
equivalently, the element in the $i, j$ position of $(I - T)^{-1}$.
From this it follows that the expected number of trials on which the process
remains in the open states starting from $S_i$ is obtained by summing the
elements in the $i$th row of $(I - T)^{-1}$.
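The row-sum rule is easy to try out numerically. The sketch below inverts $I - T$ by Gauss-Jordan elimination for the open-state block (states $S_2$, $S_1$, $S_0$) of the transition matrix printed in Example 3. The routine `invert` is our own illustrative helper. Because the printed transition probabilities are rounded to three decimals, the computed entries come out close to, but not identical with, the inverse quoted in that example.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Open-state block of Example 3; row and column order: S2, S1, S0.
T = [
    [0.806, 0.0,   0.0],
    [0.0,   0.716, 0.0],
    [0.004, 0.003, 0.896],
]
I_minus_T = [[(1.0 if i == j else 0.0) - T[i][j] for j in range(3)] for i in range(3)]

N = invert(I_minus_T)                    # expected numbers of visits
expected_years = [sum(row) for row in N] # expected time spent in the open states

for label, row, total in zip(("S2", "S1", "S0"), N, expected_years):
    print(label, [round(x, 3) for x in row], round(total, 3))
```

The last row gives the expected numbers of years a person starting in $S_0$ spends in $S_2$, $S_1$, and $S_0$, and its sum is the expected time before absorption.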


Example 3 Let us suppose that the absorbing chain of Example 1 of
Section 20.2 has the transition matrix*

                          State in (i + 1)st year
                     S4      S3      S2      S1      S0
              S4     1       0       0       0       0
              S3     0       1       0       0       0
  State in    S2     0.054   0.140   0.806   0       0
  ith year    S1     0.027   0.257   0       0.716   0
              S0     0.097   0       0.004   0.003   0.896

Then

$$(I - T)^{-1} = \begin{pmatrix} 5.158 & 0 & 0 \\ 0 & 3.525 & 0 \\ 0.211 & 0.106 & 9.601 \end{pmatrix}$$

so that, according to the model, a sane person, on the average, spends
9.601 years in state $S_0$, 0.106 year in state $S_1$, and 0.211 year in state $S_2$,


*Based on estimates provided by Marshall, A. W., and Goldhamer, H., "An Application
of Markov Processes to the Study of the Epidemiology of Mental Diseases," Journal of
the American Statistical Association 50, 99-129 (1955).





States $s_3$ and $s_4$ represent the animal's final orientation. The experiment
is so conducted that an animal always starts in state $s_0$.

(a) Find the probability that the process is ultimately absorbed in state $s_3$.

(b) Find the mean time to absorption.

(c) Find the probability that the process returns to state $s_0$ exactly
$k$ times before absorption.


FIGURE 8 



13. Prove that the expected number of changes ‘"^y se°tingp,! = 'o 

can be calculated by forming a new matrix i(J new row sum 

“o" 

for the original process 

14. For the model of choice reaction in Problem 12, compute the expected
number of changes of state.


20.5 APPLICATIONS IN SOCIOLOGY

Sociologists are frequently concerned with the way in which the
structural state of a social system changes over time. The following example
illustrates the use of a Markov chain model for a social process characterized
by the independence of path assumption.

Example 1 The relation "dominates" is ordinarily assumed to be
irreflexive (no person dominates himself), asymmetric (if $x$ dominates $y$, then
$y$ does not dominate $x$), and nontransitive (if $x$ dominates $y$ and $y$ dominates
$z$, one cannot necessarily conclude that $x$ dominates $z$). We shall assume that
the relation exists between all pairs of members of the population. For
simplicity of exposition, we shall restrict our attention to three-member
groups.* The notation $x \to y$ will mean $x$ dominates $y$.

From the above assumptions, it follows that in a three-member group there
are only eight distinct dominance patterns (see Figure 9).
Given any dominance pattern, the authority of individual $i$, denoted by
$A_i$, is defined to be the number of persons whom $i$ dominates. Thus in $D_3$,
$A_x = 2$, $A_y = 1$, and $A_z = 0$.

*From Bartos, O. J., Simple Models of Group Behavior (Columbia University Press,
New York, 1967), Chapters 5 and 6.





9. A Markov chain starts in state $S_i$. Let $X_i$ denote the number of trials
(including trial zero) the process remains in $S_i$. Prove that $E(X_i) =
1/(1 - p_{ii})$, where $p_{ii}$ is the probability of transition from $S_i$ to $S_i$.

10. For the assignment process of Example 2 of Section 20.2 compute

(a) $(I - T)^{-1}$, where $T$ is the matrix of transition among the states in
open classes

(b) the expected number of times that the process remains in the open
classes

(c) the probability that the process reaches state (0, 0, 2) before

(i) (0, 2, 0) (ii) (0, 2, 0) or (0, 1, 1)

(Hint: Make all of these states absorbing.)

(d) the probability that the process ultimately reaches state (0, 1, 1)

11. Adelman* described an application of Markov chain theory to the
derivation of the equilibrium size distribution of steel firms in the United
States. The range of total corporate assets was partitioned into seven
asset classes which constituted the states of the growth process. The
transition matrix of the chain was given as

             s0        s1        s2        s3        s4        s5      s6
      s0   0.99942   0.00040   0.00016   0.00001   0.00001   0       0
      s1   0.021     0.911     0.068     0         0         0       0
      s2   0.024     0.039     0.908     0.028     0.001     0       0
 P =  s3   0         0         0.076     0.872     0.052     0       0
      s4   0.008     0         0         0.016     0.947     0.028   0
      s5   0         0         0         0         0.037     0.926   0.037
      s6   0         0         0         0         0         0.024   0.976

(a) Find the average length of time (years) that a firm will spend in
state $s_0$, state $s_3$, or state $s_6$, once that state is entered.

(b) Find the probability that a firm starting in state si moves first to 
state S2 and then to one of states S4, s 0 , or so 

(c) Find the conditional probability that a firm moves to state S2 given 
that it leaves state s 0 

12. Bower† used the simple Markov model in Figure 8 to describe the choice
behavior of an animal in a T-maze. The states $s_0$, $s_1$, and $s_2$ represent
the animal's orientation toward stimulus sets $S_0$, $S_1$, and $S_2$, respectively,
where $S_0$ denotes the stimuli available when the animal is oriented
straight ahead, $S_1$ is the set available when the animal is oriented to the
right, and $S_2$ is the set available when the animal is oriented to the left.


*Adelman, Irma G., "A Stochastic Analysis of the Size Distribution of Firms," Journal
of the American Statistical Association 53, 893-904 (1958).

†Bower, G. H., "Choice-Point Behavior," in Studies in Mathematical Learning Theory,
Bush, R. R., and Estes, W. K., eds. (Stanford University Press, Stanford, Calif., 1959),
Chapter 6.





To illustrate the computation of $P$, suppose the process is in state $D_3$.
Then transition to $D_2$ or to three of the other authoritarian states would require
at least two dominance reversals, which is not possible in a single encounter.
Transition to $D_1$ requires that $x$ meet $z$ (probability $\frac{1}{3}$) and that $z$
dominate $x$ (probability $\frac{1}{2}(1 - 2w)$, since in $D_3$ the authority of $z$ is 0 and
that of $x$ is 2). Hence transition from $D_3$ to $D_1$ carries probability
$a = \frac{1}{6}(1 - 2w)$. Other transition probabilities are computed in a similar manner.
The limiting vector is

$$\pi = (\beta, \beta, \gamma, \gamma, \gamma, \gamma, \gamma, \gamma)$$

where $\beta = (1 - 2w)/4(2 - w)$ and $\gamma = 1/4(2 - w)$. Thus, no matter what
the initial state of the process, the limiting probability of being in one of the
egalitarian states $D_1$ or $D_2$ is $2\beta = (1 - 2w)/(4 - 2w)$, while the
probability of being in one of the authoritarian states $D_3, \ldots, D_8$ is
$6\gamma = 3/(4 - 2w)$. Since $0 \le w \le \frac{1}{2}$, it follows that the limiting probability $p_A$ of
an authoritarian structure lies between $\frac{3}{4}$ and 1, the value $\frac{3}{4}$ occurring when
$w = 0$, that is, when no weight is given to differences in individual authority.
When $w = \frac{1}{2}$ we find $p_A = 1$. In this case, $a = 0$, $\beta = 0$, and states
$D_3, \ldots, D_8$ form a single closed communication class into which the process
must eventually pass.
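The limiting vector can be checked by building the eight-state chain directly from the stated rules: each pair meets with probability $\frac{1}{3}$, and $i$ dominates $j$ with probability $\frac{1}{2}[1 + w(A_i - A_j)]$. In the sketch below the encoding of a dominance pattern as three true/false flags, the state ordering, and the value $w = 0.25$ are our own choices. The numbering $D_1, \ldots, D_8$ of Figure 9 is not reproduced, so egalitarian states are identified by the fact that every member has authority 1.

```python
from itertools import product

PAIRS = [("x", "y"), ("x", "z"), ("y", "z")]

def authority(state):
    """state[k] is True when the first member of PAIRS[k] dominates the second."""
    score = {"x": 0, "y": 0, "z": 0}
    for (first, second), first_wins in zip(PAIRS, state):
        score[first if first_wins else second] += 1
    return score

def transition_matrix(w):
    """Build the 8 x 8 transition matrix for one encounter."""
    states = list(product([True, False], repeat=3))
    index = {s: k for k, s in enumerate(states)}
    P = [[0.0] * 8 for _ in range(8)]
    for s in states:
        auth = authority(s)
        for k, (first, second) in enumerate(PAIRS):      # each pair meets w.p. 1/3
            p_first = 0.5 * (1 + w * (auth[first] - auth[second]))
            for first_wins, prob in ((True, p_first), (False, 1 - p_first)):
                nxt = s[:k] + (first_wins,) + s[k + 1:]
                P[index[s]][index[nxt]] += prob / 3
    return states, P

def limiting_vector(P, steps=5000):
    """Approximate the limiting vector by repeatedly applying v -> vP."""
    v = [1.0 / len(P)] * len(P)
    for _ in range(steps):
        v = [sum(v[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]
    return v

w = 0.25
states, P = transition_matrix(w)
pi = limiting_vector(P)
beta = (1 - 2 * w) / (4 * (2 - w))
gamma = 1 / (4 * (2 - w))
egalitarian = [k for k, s in enumerate(states) if set(authority(s).values()) == {1}]

for k, s in enumerate(states):
    kind = "egalitarian" if k in egalitarian else "authoritarian"
    print(kind, round(pi[k], 6))
print("beta =", round(beta, 6), "gamma =", round(gamma, 6))
```

The two egalitarian patterns receive limiting probability $\beta$ and the six authoritarian patterns receive $\gamma$, matching the formulas above for this value of $w$.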


PROBLEMS


1. Verify the computation of the transition matrix $P$.

2. Go through the calculations to obtain the limiting vector.

3. Argue that when $w < \frac{1}{2}$ the chain consists of a single communication
class, but that if $w = \frac{1}{2}$ there are three classes, two open and
one closed, all aperiodic.

4. In the special case $w = \frac{1}{2}$, find the expected length of time the process
will spend in the open classes.

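5. From the point of view of distribution of authority, there are two
distinct dominance structures. In $D_1$ and $D_2$ each member dominates
one of the others, while in $D_3, \ldots, D_8$ one member dominates both the
other members. Let us define a new process with states $S_1 = \{D_1, D_2\}$
and $S_2 = \{D_3, \ldots, D_8\}$. The new process is in state $S_1$ if the original
chain is in $D_1$ or $D_2$ and is in state $S_2$ if the original chain is in any of the
states $D_3, \ldots, D_8$. Show that this new process is itself a Markov chain
with transition matrix

$$\begin{array}{c|cc} & S_1 & S_2 \\ \hline S_1 & \frac{1}{2} & \frac{1}{2} \\ S_2 & a & b + 2c \end{array}$$

where $a$, $b$, and $c$ are defined in Example 1. (Hint: Use the fact that whether
the original chain is in state $D_1$ or in state $D_2$, there is probability $\frac{1}{2}$ of
moving to one of the states $D_3, \ldots, D_8$, and that whether the chain is in
$D_3$, $D_4$, $\ldots$, or $D_8$, there is probability $a$ of moving to $D_1$ or $D_2$.
Markov chains with this property are called lumpable.)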



I (l i ) 
- V l,+ 2C ' 


FIGURE 9


We take the point of view that the dominance pattern which exists among
three individuals is a reflection of the outcomes of past encounters between
pairs of these individuals and is subject to change through subsequent
encounters. Thus, to say that $D_3$ is the existing pattern is to say that at the
last encounter between $x$ and $y$, $x$ was dominant, that at their last encounter,
$x$ dominated $z$, and that $y$ was dominant when $y$ and $z$ last met.
The next encounter between a pair of individuals may involve $x$ and $y$, $x$
and $z$, or $y$ and $z$, and we assign probability $\frac{1}{3}$ to each of these events. The
probability that $i$ is dominant at a meeting between individuals $i$ and $j$ is
defined to be

$$p_{ij} = \tfrac{1}{2}[1 + w(A_i - A_j)]$$

where $w$ is a constant lying between 0 and $\frac{1}{2}$. When $w = 0$, each has
probability $\frac{1}{2}$ of being dominant, while positive values of $w$ reflect in varying
degrees the difference in authority between the two individuals.
Putting all our assumptions together, we see that we have a Markov chain
with states $D_1, D_2, \ldots, D_8$ and transition matrix



where $a = \frac{1}{6}(1 - 2w)$, $b = \frac{1}{6}(3 + 4w)$, and $c = \frac{1}{6}(1 - w)$.
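The constants $a$, $b$, and $c$ can be recovered from the dominance probability $\frac{1}{2}[1 + w(A_i - A_j)]$ by working through the three possible encounters from pattern $D_3$ (authorities $A_x = 2$, $A_y = 1$, $A_z = 0$). The following sketch does this with exact fractions; the function names and the sample value $w = \frac{1}{4}$ are our own.

```python
from fractions import Fraction as F

def p_dominant(a_i, a_j, w):
    """Probability that i dominates j, given their current authorities."""
    return F(1, 2) * (1 + w * (a_i - a_j))

def d3_row(w):
    """Probabilities of the possible outcomes of one encounter starting from D3."""
    auth = {"x": 2, "y": 1, "z": 0}
    meet = F(1, 3)                                          # each pair meets w.p. 1/3
    a = meet * p_dominant(auth["z"], auth["x"], w)          # z reverses x -> z
    c_xy = meet * p_dominant(auth["y"], auth["x"], w)       # y reverses x -> y
    c_yz = meet * p_dominant(auth["z"], auth["y"], w)       # z reverses y -> z
    b = 1 - a - c_xy - c_yz                                 # pattern unchanged
    return a, b, c_xy, c_yz

w = F(1, 4)
a, b, c_xy, c_yz = d3_row(w)
print(a, b, c_xy, c_yz)
print(a == F(1, 6) * (1 - 2 * w), b == F(1, 6) * (3 + 4 * w), c_xy == F(1, 6) * (1 - w))
```

The two single-step reversals that keep the structure authoritarian each carry probability $c$, so the row for $D_3$ contains $a$, $b$, and $c$ twice, and $a + b + 2c = 1$.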



Kreweras,* in developing a model for this situation, assumes two classes
of voters: those who are resolute in their convictions and who thus vote
each time for the same alternative, and those who are floaters in the sense
that they are influenced by the results of the preceding ballot.
To take a specific case, suppose there are two alternatives, five resolute
voters divided two-three in favor of the respective alternatives,
and three floaters. To win, an alternative must receive at least five of the
eight votes. Since the voting pattern of the resolutes is fixed, the state of
the voting process can be indicated by a pair $(x, y)$ in which $x$ and $y$ are the
numbers of floaters voting for the respective alternatives. Of course, $x + y$
must equal three. We shall assume that on any particular ballot a
floater votes for a particular alternative with probability proportional
to the total number of votes cast for that alternative on the preceding
ballot. For instance, if resolutes vote (2, 3) and floaters $(x, y)$ on one
ballot, then the probability that a floater votes for the first alternative
on the next ballot is $(x + 2)/8$. On the first ballot, alternatives are
chosen at random by the floaters.

Argue that these assumptions lead to a Markov chain with 4 states,
3 of which are absorbing. Write the transition matrix and determine the
probabilities that each of the respective alternatives wins.

10. Rework Problem 9 assuming three alternatives and seven voters with
four resolutes divided two-one-one. Majority wins.

11. A tentative model of the spread of states in small groups has been given by
Karlsson.† He assumes that the behavior of an individual group member
$k$ during time period (trial) $t + 1$ depends upon the degree of influence
that each group member has upon $k$ and upon the behavior of the group
members, himself included, during the preceding period.

A measure of the behavior of the group is given in the matrix
$B(t) = (b_{ki}(t))$ in which the $k, i$ element $b_{ki}(t)$ indicates the proportion of
time period $t$ that individual $k$ spends in behavior state $i$. Influence
patterns are indicated by the matrix $A = (\alpha_{kj})$ in which element
$\alpha_{kj}$ represents the influence that individual $j$ has upon individual $k$. The
coefficients are normalized so that all $\alpha_{kj}$ are non-negative and so that
$\sum_j \alpha_{kj} = 1$ for each $k$.

The basic assumption of the model is that

$$B(t + 1) = AB(t) \qquad (12)$$

which states that the proportion of time spent in behavior state $i$ by
individual $k$ at time $(t + 1)$ is a weighted average

$$b_{ki}(t + 1) = \sum_j \alpha_{kj} b_{ji}(t)$$

of the proportions of time spent in behavior state $i$ by each of the group
members at time $t$, the weights being the influence coefficients $\alpha_{kj}$.
G • " A Model lo Weight Indwidoal ° H Ml* 

Sclencej, Sternberg. S , Ed (Mouton, • rrmim" /l-'Vlur >‘f 

' *lfc.V G : "hlole on the Spread pf n Slate in Small Socu 


■. a-* . on me opiwu « 

/ niopAji/ei 17, 1-5 (1955) 





and a state of conditioning random variable

$$C_n = \begin{cases} 1 & \text{if } s \text{ is conditioned to } r_1 \text{ on trial } n \\ 2 & \text{if } s \text{ is conditioned to } r_2 \text{ on trial } n \end{cases}$$

From the above assumptions we have

$$P(R_n = 1 \mid C_n = 1) = 1 \quad \text{and} \quad P(R_n = 1 \mid C_n = 2) = 0$$

We assume that $P(E_n = 1) = \pi$ and $P(E_n = 2) = 1 - \pi$ where $\pi$ is a
constant not dependent on trial number or on the outcomes of previous
trials (independence of path). In particular, reinforcement on a trial is
assumed not to be contingent on the subject's response on that trial.
If the stimulus element $s$ is already conditioned to the response which is
reinforced, it is assumed to remain so conditioned. If it is not conditioned to
that response, it becomes conditioned with probability $\theta$ (the conditioning is
then said to have been effective), while with probability $1 - \theta$ no change in
conditioning occurs.

Our assumptions imply that we have a Markov chain with states $S_1$ and
$S_2$ corresponding to the two responses to which $s$ may be conditioned.
Computation of transition probabilities is illustrated in the tree diagram of
Figure 10. If $s$ is conditioned to response $r_1$ (state $S_1$) the subject will make
that response. With probability $\pi$, $r_1$ is reinforced, in which case $s$ will again
be conditioned to $r_1$ and state $S_1$ will occur again.
With probability $1 - \pi$, $r_2$ is reinforced. If the reinforcement is effective
(probability $\theta$), $s$ becomes conditioned to $r_2$ and the process moves to state
$S_2$; otherwise (probability $1 - \theta$), state $S_1$ occurs again.


FIGURE 10





(a) Use Equation (12) to show that

$$B(t) = A^t B(0)$$

where $A^t$ denotes the $t$th power of $A$, and $B(0)$ is the matrix of initial
time proportions.

(b) A set $C$ of group members is called closed if for each $k \in C$ and
$i \notin C$ we have $\alpha_{ki} = 0$. A group is called irreducible if there are no
closed subsets of group members other than the entire group.

If a group is irreducible, prove that

(i) $A^t$ converges to a stochastic matrix $M$ each of the rows of which
is the same probability vector.

(ii) $\lim_{t \to \infty} B(t) = MB(0)$

(iii) $B(t + 1) - B(t) \to O$ as $t \to \infty$
(iv) Max, b ,(0) > b ,(/) for all j and all / 


20.6 APPLICATIONS IN PSYCHOLOGY

In psychology, Markov chains have been most widely used as models of the
learning process. The examples in this section illustrate the basic ideas.


Example 1 In stimulus sampling theory* it is assumed that there is a set
$S = \{s_1, s_2, \ldots, s_t\}$ of stimulus elements of which exactly one is sampled
on each trial. The probabilities of sampling the various elements remain
constant from trial to trial. Each stimulus element is assumed to be
conditioned to exactly one of a set of responses $R = \{r_1, r_2, \ldots, r_m\}$. On each
trial the response given is the one to which the obtained stimulus element is
conditioned.

To take the simplest case, let us consider an experiment in which a single
stimulus element $s$ is presented to the subject on each trial (see Example 2,
Section 18.6) and in which the subject makes one of two responses $r_1$ or $r_2$,
one of which is reinforced. For $n = 1, 2, 3, \ldots$, it is convenient to define a
response random variable

$$R_n = \begin{cases} 1 & \text{if response } r_1 \text{ occurs on trial } n \\ 0 & \text{if response } r_2 \text{ occurs on trial } n \end{cases}$$

a reinforcement random variable

$$E_n = \begin{cases} 1 & \text{if response } r_1 \text{ is reinforced on trial } n \\ 2 & \text{if response } r_2 \text{ is reinforced on trial } n \end{cases}$$


*For a complete discussion see Suppes, P., and Atkinson, R. C., Markov Learning Models
for Multiperson Interactions (Stanford University Press, Stanford, Calif., 1960).





there is a different transition matrix $T_k$ corresponding to each alternative
$A_k$. Thus if $v(n)$ is the probability vector at time $n$, and if alternative
$A_k$ ($k = 1, 2, \ldots, r$) is chosen at trial $n$, the new vector of probabilities at
trial $(n + 1)$ is

$$v(n + 1) = v(n)T_k$$

The inherent difficulty in the analysis of the model lies in the fact that the
sequence of transition matrices used is itself determined by chance. However,
some general results may be obtained and certain special cases are relatively
easy to analyze.

Let us consider the simplest case of two alternatives $A_1$ and $A_2$ with
corresponding transition matrices

$$T_1 = \begin{pmatrix} 1 - b_1 & b_1 \\ a_1 & 1 - a_1 \end{pmatrix} \qquad T_2 = \begin{pmatrix} 1 - b_2 & b_2 \\ a_2 & 1 - a_2 \end{pmatrix}$$

Defining $\alpha_i = 1 - a_i - b_i$ and $\lambda_i = a_i/(1 - \alpha_i)$, it is easily verified that

$$v(n)T_i = \alpha_i v(n) + (1 - \alpha_i)\Lambda_i \qquad (13)$$

where $\Lambda_i$ is the vector $(\lambda_i, 1 - \lambda_i)$. The usual models require that $\alpha_i$ be
non-negative.

Equation (13) shows that the effect of applying the transformation $T_1$
to a vector $v(n)$ is to give a linear combination of $v(n)$ and the vector $\Lambda_1$.
Since both $\alpha_1$ and $1 - \alpha_1$ lie between zero and unity, the new vector
$v(n + 1) = v(n)T_1$ lies between $v(n)$ and $\Lambda_1$ and, in particular, is closer to
$\Lambda_1$ than $v(n)$ is. In fact, if $T_1$ is applied to $v(n)$ $k$ consecutive times, we obtain

$$v(n + k) = v(n)T_1^k = \alpha_1^k v(n) + (1 - \alpha_1^k)\Lambda_1 \qquad (14)$$

which, as $k \to \infty$, converges to $\Lambda_1$. Thus $\Lambda_1$ is the limiting vector of a
Markov chain whose transition matrix is $T_1$. Our Markov
chain theory tells us that $\Lambda_1$ should be the unique solution of the equation
$v = vT_1$. Putting $v(n) = \Lambda_1$ in (13) shows that this is indeed the case.
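Equations (13) and (14) are simple enough to verify numerically. The sketch below assumes that $T_1$ has rows $(1 - b_1, b_1)$ and $(a_1, 1 - a_1)$, which is the arrangement under which (13) holds with $\lambda_1 = a_1/(1 - \alpha_1)$; the parameter values and the starting vector are hypothetical.

```python
def apply(v, T):
    """Return the row vector v multiplied by the 2 x 2 matrix T."""
    return [v[0] * T[0][0] + v[1] * T[1][0], v[0] * T[0][1] + v[1] * T[1][1]]

# Hypothetical parameters for alternative A1 (assumed, for illustration only).
a, b = 0.2, 0.1
T = [[1 - b, b], [a, 1 - a]]
alpha = 1 - a - b
lam = a / (1 - alpha)
Lambda = [lam, 1 - lam]

v = [0.9, 0.1]                       # an arbitrary starting probability vector

# Equation (13): one application of T.
lhs = apply(v, T)
rhs = [alpha * v[i] + (1 - alpha) * Lambda[i] for i in range(2)]
print(lhs, rhs)

# Equation (14): k consecutive applications of T.
k = 25
vk = v
for _ in range(k):
    vk = apply(vk, T)
rhs_k = [alpha ** k * v[i] + (1 - alpha ** k) * Lambda[i] for i in range(2)]
print(vk, rhs_k)

# Lambda is left unchanged by T, that is, it solves v = vT.
print(apply(Lambda, T), Lambda)
```

With each application the vector moves a fixed fraction $1 - \alpha_1$ of the remaining distance toward $\Lambda_1$, which is why the approach to the limit is geometric.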


PROBLEMS

4. Verify the statements made, but not proved, in Example 2:

(a) derive Equation (13);

(b) verify that $v(n + 1) = v(n)T_1$ actually does lie between $v(n)$ and
$\Lambda_1$;

(c) prove (14) by induction;

(d) verify that $\Lambda_1$ is the only solution of the equation $v = vT_1$.

5. Argue that all the statements made in Example 2 concerning $T_1$ have
direct counterparts for $T_2$.




the first line in the transition matrix

                                To state
                         S_1                 S_2
  From            S_1    1 - θ + θπ          θ - θπ
  conditioning
  state           S_2    θπ                  1 - θπ

The second line may be computed similarly.
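The entries of this matrix can be rebuilt mechanically by walking the branches of the tree: first which response is reinforced, then, if the element is not already conditioned to that response, whether the conditioning is effective. The sketch below does so; the function name and the sample values of $\pi$ and $\theta$ are our own.

```python
def transition_matrix(pi_r, theta):
    """Build the 2 x 2 matrix over (S1, S2) by enumerating the tree branches.

    pi_r  : probability that response r1 is reinforced on a trial
    theta : probability that a reinforcement is effective
    """
    P = [[0.0, 0.0], [0.0, 0.0]]
    for state in (0, 1):                                   # 0 = S1, 1 = S2
        for reinforced, p_reinf in ((0, pi_r), (1, 1 - pi_r)):
            if reinforced == state:
                # Already conditioned to the reinforced response: no change.
                P[state][state] += p_reinf
            else:
                P[state][reinforced] += p_reinf * theta        # effective
                P[state][state] += p_reinf * (1 - theta)       # not effective
    return P

pi_r, theta = 0.6, 0.25
P = transition_matrix(pi_r, theta)
closed_form = [
    [1 - theta + theta * pi_r, theta - theta * pi_r],
    [theta * pi_r,             1 - theta * pi_r],
]
print(P)
print(closed_form)
```

Both rows agree with the closed-form entries, and each row sums to 1 because the branches from a state exhaust all possibilities.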


PROBLEMS

1. (a) For the transition matrix $P$ of Example 1, compute $P^2$, $P^3$, $P^4$, and,
by induction, $P^n$ for all $n \ge 2$. (Hint: See Problem 25, Section 10.3.)

(b) Find $\lim_{n \to \infty} P^n$.

2. (a) Prove that $P(R_n = 1) = P(C_n = 1)$ for all $n$.

(b) Use the result of (a) and of Problem 1 to find

$$U_n = P(R_n = 1), \quad n = 2, 3, \ldots$$

in terms of $U_1 = P(R_1 = 1)$.

(c) Find $\lim_{n \to \infty} U_n$.

3. (a) Find the mean and variance of $R_n$.

(b) Find $\mathrm{Cov}(R_n, R_m)$ for $m \ne n$.

(Hint: $E(R_n R_m) = P(R_n = 1, R_m = 1)$.)

(c) Prove that 

is*Ii .?/*]- 

Example 2 The learning model of Bush and Mosteller,* while not strictly
a Markov chain, is closely related. The model considers a choice situation,
such as that facing a rat in a T-maze, in which there is presented a set of $r$
alternatives $A_1, A_2, \ldots, A_r$ from which the subject must choose exactly
one. This set of alternatives is repeatedly presented to the subject in a sequence
of trials.

The state of the system at trial $n$ is the alternative chosen on that trial.
The probability that alternative $A_k$ is chosen on trial $n$ will be denoted by
$v_k(n)$ and these probabilities arrayed in a probability vector

$$v(n) = (v_1(n), v_2(n), \ldots, v_r(n))$$

The difference between this model and an ordinary Markov chain lies in
the assumption that, rather than a single transition matrix for the process,

*Bush, R. R., and Mosteller, F., Stochastic Models for Learning (John Wiley & Sons, Inc.,
New York, 1955).





CONTINUOUS TIME PROCESSES AND CONTINUOUS RANDOM VARIABLES



21.1 THE POISSON PROCESS

In the Bernoulli process we considered occurrences of a phenomenon called
"success" at points on a discrete time (trial) scale. In this
chapter we turn our attention to phenomena which occur on a continuous
time scale. Simple examples are the successive arrivals of voters at a
polling place, successive purchases of a brand item, and times at which
automobiles pass a particular point on the highway.
We assume that observation begins at time $t = 0$ and write $X_t$ to denote
the number of occurrences of the phenomenon in the time interval $(0, t]$.
Thus $X_0 = 0$ and the difference $X_u - X_s$ gives the number of occurrences
in the interval $(s, u]$. (See Figure 1.)


FIGURE 1 ($X_t$ counts the number of occurrences in the interval $(0, t]$; $X_u - X_s$ counts the occurrences in $(s, u]$.)


There is a random variable $X_t$ corresponding to each time $t$. By
making various assumptions about the joint probability functions of
these random variables we generate various probability models for such
processes. In this section we shall discuss the simplest such model, the Poisson
process, which is governed by three assumptions:

(1) Stationarity: The number of occurrences in a time interval depends
only on the length of the interval and not on its location. Thus if





6. We have seen that the effect of applying transformation $T_1$ to a vector $v$
is to move closer to the vector $\Lambda_1$. Similarly, applying $T_2$ moves us closer
to the vector $\Lambda_2$ satisfying $\Lambda_2 = \Lambda_2 T_2$. What happens when we apply
first $T_1$ and then $T_2$? How do $vT_1T_2$ and $vT_2T_1$ compare?

7. The results of Problem 6 show that the order in which transformations
are applied affects the final result and, in particular, that it is the last
transformation which has the greatest single effect on the present
probability vector. This difficulty would vanish if the transformations should
commute. However, show that commutativity holds only if the
transformations have the same limiting vectors or if one is the identity matrix.


SUPPLEMENTARY READING 


Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley
& Sons, Inc., New York, 1968), Vol. I, 3rd ed., Chapter XV.

Kemeny, J. G., and Snell, J. L., Finite Markov Chains (D. Van Nostrand
Company, Inc., Princeton, N.J., 1960).





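The first of the differential equations, $p_0'(t) = -\lambda p_0(t)$, may be written

$$(D + \lambda)p_0(t) = 0$$

the solution of which has the form

$$p_0(t) = ce^{-\lambda t}$$

Since $X_0 = 0$, we have $p_0(0) = 1$ so that

$$p_0(t) = e^{-\lambda t}$$

As $t$ increases, the probability of no occurrences decreases, converging to zero.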
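The system $p_0'(t) = -\lambda p_0(t)$, $p_n'(t) = -\lambda p_n(t) + \lambda p_{n-1}(t)$ can also be explored numerically. The sketch below steps the equations forward with the recurrence $p_n(t + h) = \lambda h\,p_{n-1}(t) + (1 - \lambda h)p_n(t)$, that is, the relation from which they arise with the $o(h)$ terms dropped, and compares the computed $p_0(t)$ with $e^{-\lambda t}$. The rate, time horizon, step count, and truncation at `n_max` occurrences are arbitrary illustrative choices.

```python
import math

def integrate(lam, t_end, n_max=10, steps=100_000):
    """Approximate p_0(t_end), ..., p_{n_max}(t_end) by small steps of length h."""
    h = t_end / steps
    p = [1.0] + [0.0] * n_max            # p_0(0) = 1 because X_0 = 0
    for _ in range(steps):
        new = [(1 - lam * h) * p[0]]
        for n in range(1, n_max + 1):
            new.append(lam * h * p[n - 1] + (1 - lam * h) * p[n])
        p = new
    return p

lam, t_end = 2.0, 1.5
p = integrate(lam, t_end)
print("numerical p_0(t):", p[0])
print("exp(-lam * t):   ", math.exp(-lam * t_end))
print("total probability retained for n <= 10:", sum(p))
```

The two values of $p_0(t)$ agree to several decimal places, and the agreement improves as the step length $h$ is reduced.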


*Here $o(h)$ denotes a function such that $o(h)/h \to 0$ as $h \to 0$.




$u - s = t$, the random variable $X_u - X_s$ has the same probability
function as $X_t$.

(2) Independence: The numbers of occurrences in two disjoint time
intervals are independent. That is, if $t < s < u < v$, then the random
variables $X_s - X_t$ and $X_v - X_u$ are independent. This assumption is
described by saying that the process has independent increments.

(3) There exists a positive constant $\lambda$ such that if $h$ is small,

(a) the probability of no occurrences in $(0, h]$ is approximately
$1 - \lambda h$,

(b) the probability of exactly one occurrence in $(0, h]$ is approximately
$\lambda h$,

(c) the probability of more than one occurrence in $(0, h]$ is negligible.


More precisely, we assume that

$$P(X_h = 0) = 1 - \lambda h + o_1(h)$$

$$P(X_h = 1) = \lambda h + o_2(h)$$

$$P(X_h \ge 2) = o_3(h)$$

where $o_1$, $o_2$, and $o_3$ denote functions which are negligible relative to $h$
when $h$ is small. That is,

$$\lim_{h \to 0} \frac{o_1(h)}{h} = \lim_{h \to 0} \frac{o_2(h)}{h} = \lim_{h \to 0} \frac{o_3(h)}{h} = 0$$

In many behavioral situations, these assumptions seem quite reasonable.
For example, automobile accident rates remain fairly constant over time and
those fluctuations which occur may reasonably be ascribed to chance effects.
The probability function for number of accidents depends on the length of
the time interval involved while, within limits, the number of accidents in
one interval is unaffected by the number in another, nonoverlapping, interval.
The rationale behind assumption (3) is roughly as follows. Let $\lambda = E(X_1)$
be the expected number of occurrences in a unit time interval. Partition this
unit interval into $N$ subintervals of equal length $h = 1/N$. The probability
of at least one occurrence within any one of these subintervals is

$$1 - P(X_h = 0)$$

so the expected number of subintervals containing an occurrence is

$$\frac{1}{h}[1 - P(X_h = 0)]$$

Intuitively, one feels that as $h \to 0$, this number should approach the expected
number $\lambda$ of occurrences in the unit interval and this is the crux of
assumption (a). Implicit in this argument is the assumption that the probability of
two or more occurrences in any subinterval is negligible as $h \to 0$. Obviously,
such considerations may provide a model for accidents but not for numbers


652 / CONTINUOUS PROCESSES AND VARIABLES 



of cars involved in accidents. For here it is quite likely that we find multiple occurrences within a short interval, in fact, simultaneously.
To derive the probability function of $X_t$ we write

$$X_{t+h} = X_t + (X_{t+h} - X_t) \qquad (1)$$

and partition the event $(X_{t+h} = n)$ into

$$(X_{t+h} = n) = \bigcup_{k=0}^{n} [X_t = k,\ X_{t+h} - X_t = n - k] \qquad (2)$$

Equation (1) expresses the fact that the number of occurrences up to time $t + h$ is the number up to time $t$ plus the number which occur between times $t$ and $t + h$. Equation (2) says that if there are $n$ occurrences by time $t + h$, then in time $t$ we could have zero or one or ... or $n$, while the remainder come in $(t, t + h]$.

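Writing $p_n(t)$ for the probability $P(X_t = n)$ of $n$ occurrences in time $t$, we have, using assumptions 1 and 2,

$$p_n(t + h) = \sum_{k=0}^{n} p_k(t)\,p_{n-k}(h)$$

But by assumption 3, $p_{n-k}(h) = o(h)$ for $n - k \ge 2$, so for $n \ge 1$

$$p_n(t + h) = o(h) + p_{n-1}(t)[\lambda h + o(h)] + p_n(t)[1 - \lambda h + o(h)]$$

$$= \lambda h\,p_{n-1}(t) + (1 - \lambda h)\,p_n(t) + o(h)$$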
while

$$p_0(t + h) = (1 - \lambda h)\,p_0(t) + o(h)$$

Subtracting $p_n(t)$, dividing by $h$, and letting $h \to 0$, we obtain the differential equations

$$p_0'(t) = -\lambda p_0(t)$$

$$p_n'(t) = -\lambda p_n(t) + \lambda p_{n-1}(t) \qquad n \ge 1$$

which must be solved to obtain the desired probability function of $X_t$.
The first equation may be written

$$(D + \lambda)\,p_0(t) = 0$$

the solution of which has the form

$$p_0(t) = c_0 e^{-\lambda t}$$

Since $X_0 = 0$, we have $p_0(0) = 1$ so that

$$p_0(t) = e^{-\lambda t}$$

As time passes, the probability of no occurrences decreases, converging to zero.

(Here $o(h)$ denotes a function such that $\lim_{h \to 0} o(h)/h = 0$; all such functions are combined into one.)


Each of the remaining equations has the form

$$(D + \lambda)\,p_n(t) = \lambda p_{n-1}(t)$$

and the equations may be solved successively. Thus

$$(D + \lambda)\,p_1(t) = \lambda p_0(t) = \lambda e^{-\lambda t}$$

Multiplying by $e^{\lambda t}$ gives

$$D[e^{\lambda t} p_1(t)] = \lambda$$

from which, since $p_1(0) = 0$, we find

$$p_1(t) = \lambda t\,e^{-\lambda t}$$

Continuing in this manner gives

$$p_n(t) = P(X_t = n) = \frac{e^{-\lambda t}(\lambda t)^n}{n!} \qquad n = 0, 1, 2, \ldots \qquad (3)$$

as may be verified by induction.

For each fixed value of $t$, the random variable $X_t$ of which (3) is the probability function is called a Poisson* random variable with parameter $\lambda t$. The entire process described in assumptions 1-3 is called a Poisson process with rate $\lambda$.
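
The successive solution above can also be checked numerically. The following is a minimal sketch, not part of the original text: it integrates the differential equations with simple Euler steps and compares the result with the closed form (3). The function names, the step count, and the values $\lambda = 2$, $t = 1.5$ are illustrative assumptions.

```python
import math

def poisson_pmf(n, lam, t):
    """Closed form (3): P(X_t = n) = exp(-lam*t) * (lam*t)**n / n!."""
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)

def integrate_poisson_odes(lam, t_end, n_max, steps=20000):
    """Euler integration of p0' = -lam*p0 and pn' = -lam*pn + lam*p(n-1)."""
    h = t_end / steps
    p = [1.0] + [0.0] * n_max  # X_0 = 0, so p_0(0) = 1
    for _ in range(steps):
        new = [p[0] - h * lam * p[0]]
        for n in range(1, n_max + 1):
            new.append(p[n] + h * (-lam * p[n] + lam * p[n - 1]))
        p = new
    return p

# Illustrative values (assumed, not from the text).
numeric = integrate_poisson_odes(lam=2.0, t_end=1.5, n_max=6)
exact = [poisson_pmf(n, 2.0, 1.5) for n in range(7)]
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
print(max_error)  # small; shrinks further as steps grows
```

Because each equation involves only $p_n$ and $p_{n-1}$, truncating the system at `n_max` does not disturb the lower-order solutions.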

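Since the series expansion of $e^{\lambda t}$ is

$$e^{\lambda t} = \sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!}$$

it follows that the probabilities in (3) sum to unity, as they should. The expected value of $X_t$ is

$$E(X_t) = \sum_{n=0}^{\infty} n\,\frac{e^{-\lambda t}(\lambda t)^n}{n!} = e^{-\lambda t}\,\lambda t \sum_{n=1}^{\infty} \frac{(\lambda t)^{n-1}}{(n-1)!} = \lambda t$$

In particular, setting $t = 1$, we find that $\lambda$ represents the expected number of occurrences in a unit time interval, as suggested in the intuitive remarks which followed assumption 3.
Since

$$E(X_t(X_t - 1)) = \sum_{n=0}^{\infty} n(n-1)\,\frac{e^{-\lambda t}(\lambda t)^n}{n!} = (\lambda t)^2 e^{-\lambda t} \sum_{n=2}^{\infty} \frac{(\lambda t)^{n-2}}{(n-2)!} = (\lambda t)^2$$

then

$$\mathrm{Var}(X_t) = E(X_t(X_t - 1)) + E(X_t) - (E(X_t))^2 = (\lambda t)^2 + \lambda t - (\lambda t)^2 = \lambda t$$

The variance of a Poisson random variable is equal to its expected value.
*After Siméon D. Poisson, a nineteenth-century French probabilist.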
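
As a numerical illustration of these moment calculations, the sketch below sums the series directly. It is an assumed example, not from the original: the function name, the number of terms, and the values $\lambda = 0.8$, $t = 5$ are chosen only for demonstration.

```python
import math

def poisson_moments(lam, t, terms=200):
    """Sum the series for total probability, E(X_t), and Var(X_t)."""
    m = lam * t
    pk = math.exp(-m)          # P(X_t = 0)
    total = mean = factorial2 = 0.0
    for n in range(terms):
        total += pk
        mean += n * pk
        factorial2 += n * (n - 1) * pk   # contributes to E(X_t (X_t - 1))
        pk *= m / (n + 1)                # step from P(X_t = n) to P(X_t = n + 1)
    variance = factorial2 + mean - mean ** 2
    return total, mean, variance

total, mean, variance = poisson_moments(lam=0.8, t=5.0)
print(total, mean, variance)  # approximately 1, 4, 4
```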





In the above discussion we have interpreted $t$ as a measure of time. However, other interpretations and applications of Poisson processes are possible, as the following examples show.


Example 1 (a) Consider a textile loom which is weaving cloth continuously and which from time to time produces a weave with a broken fiber. Suppose that we assume that a broken fiber is as likely to occur in any one small unit of area as another, that numbers of breakages are independent from one area to another, and that the probability of two or more broken fibers per small unit of area is negligible. Then the number of broken fibers per $t$ square units of cloth is a Poisson random variable.

(b) In astronomical studies, it is reasonable to assume that a given region of space is as likely to contain stars as any other region of equal volume, that disjoint regions are independent, and that there is negligible probability of finding two stars in close proximity. It follows that the number of stars in $s$ cubic units of space should be a Poisson random variable.

(c) Other examples of a similar nature concern the distributions of such things as raisins in a cake, misprints in a page of type, and the pattern of hits from mortar rounds fired into a given area.

Example 2 In checking whether the Poisson distribution fits a particular set of data, a common statistical procedure is to compute the mean from the sample data, use this to generate a set of Poisson probabilities, and then compare the theoretical Poisson frequencies with the observed data.
A classic example, due to Bortkiewicz,* concerns the number of deaths in the Prussian army due to kicks by horses. The first two rows of Table 1 show the observed data gathered from the records of 200 corps. The average number of deaths per corps is

$$\bar{x} = \frac{0(109) + 1(65) + 2(22) + 3(3) + 4(1)}{200} = 0.61$$


TABLE 1

Number of deaths k                            0        1        2        3        4      Total
Number of corps with k deaths               109       65       22        3        1        200
Probabilities p_k = e^{-0.61}(0.61)^k/k!    0.5436   0.3316   0.1011   0.0206   0.0031
200 p_k                                     108.7     66.3     20.2      4.1      0.6    199.9

*Bortkiewicz, L. von, Das Gesetz der kleinen Zahlen.


The Poisson probability function with expected value 0.61 is shown in row 3 and is multiplied by 200 in row 4 to give Poisson approximations to the actual data. The degree of correspondence is remarkable. ►
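
The fitting procedure of this example is short enough to reproduce directly. The sketch below is an illustration, not part of the original text; the variable names are assumptions. It recomputes the sample mean and the Poisson frequencies from the observed counts, and the results agree with the table to within rounding in the last printed digit.

```python
import math

# Observed numbers of corps with k = 0, 1, 2, 3, 4 deaths (Table 1, row 2).
observed = [109, 65, 22, 3, 1]

n_corps = sum(observed)
sample_mean = sum(k * count for k, count in enumerate(observed)) / n_corps

# Poisson probabilities with the sample mean as parameter (row 3).
probabilities = [
    math.exp(-sample_mean) * sample_mean ** k / math.factorial(k)
    for k in range(len(observed))
]

# Theoretical frequencies for 200 corps (row 4).
expected = [n_corps * p for p in probabilities]

for k, (obs, exp_count) in enumerate(zip(observed, expected)):
    print(k, obs, round(exp_count, 1))
```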


If $X$ is a Poisson random variable with probability function

$$P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!} \qquad k = 0, 1, 2, \ldots$$

the probability generating function of $X$ is

$$g_X(s) = E(s^X) = \sum_{k=0}^{\infty} \frac{e^{-\lambda}(s\lambda)^k}{k!} = e^{-\lambda + \lambda s}$$

Let $Y$ be another Poisson random variable, independent of $X$, having probability function

$$P(Y = k) = \frac{e^{-\mu}\mu^k}{k!} \qquad k = 0, 1, 2, \ldots$$

Then since the generating function of a sum of independent random variables is the product of the individual generating functions, we find

$$g_{X+Y}(s) = g_X(s)\,g_Y(s) = e^{-\lambda + \lambda s}\,e^{-\mu + \mu s} = e^{-(\lambda + \mu) + (\lambda + \mu)s}$$

We conclude that the sum of two independent Poisson random variables, one with expectation $\lambda$ and the other with expectation $\mu$, is another Poisson random variable with parameter $\lambda + \mu$. The next example shows how this same conclusion may be obtained from the basic assumptions for a Poisson process.
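
The reproductive property can be illustrated without generating functions by convolving the two probability functions term by term. This is a hedged sketch with assumed names and assumed parameter values (1.2 and 0.7); it is not the argument used in the text, only a numerical check of its conclusion.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def sum_pmf(n, lam, mu):
    """P(X + Y = n) for independent Poisson X and Y, by direct convolution."""
    return sum(poisson_pmf(k, lam) * poisson_pmf(n - k, mu) for k in range(n + 1))

lam, mu = 1.2, 0.7  # illustrative expectations (assumed)
max_gap = max(
    abs(sum_pmf(n, lam, mu) - poisson_pmf(n, lam + mu)) for n in range(15)
)
print(max_gap)  # of the order of floating-point rounding error
```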


Example 3 (a) In a model of individual behavior in a choice situation, Audley* postulates that each of $m$ possible overt choice responses has associated with it an implicit response. Implicit responses of each type are assumed to occur in a Poisson manner, independently of responses of other types.

Let us consider the case $m = 2$, writing $X_t$ and $Y_t$ for the respective numbers of occurrences by time $t$ of the two types $r_1$ and $r_2$ of implicit response. Then there are parameters $\alpha = E(X_1)$ and $\beta = E(Y_1)$ such that

$$P(X_t = 0) = 1 - \alpha t + o(t) \qquad P(Y_t = 0) = 1 - \beta t + o(t)$$

$$P(X_t = 1) = \alpha t + o(t) \qquad P(Y_t = 1) = \beta t + o(t)$$

$$P(X_t > 1) = o(t) \qquad P(Y_t > 1) = o(t)$$

Since $X_t$ and $Y_t$ are independent,

$$P(X_t + Y_t = 0) = P(X_t = 0,\ Y_t = 0) = [1 - \alpha t + o(t)][1 - \beta t + o(t)]$$

$$= 1 - \alpha t - \beta t + o(t) = 1 - (\alpha + \beta)t + o(t)$$

$$P(X_t + Y_t = 1) = P[X_t = 0,\ Y_t = 1] + P[X_t = 1,\ Y_t = 0]$$

$$= [1 - \alpha t + o(t)][\beta t + o(t)] + [\alpha t + o(t)][1 - \beta t + o(t)]$$

$$= \beta t + \alpha t + o(t) = (\alpha + \beta)t + o(t)$$

Similar calculations show that

$$P(X_t + Y_t > 1) = o(t)$$

Hence the random variable $Z_t = X_t + Y_t$ satisfies the Poisson assumptions with intensity $\alpha + \beta$, and it follows that the probability of obtaining exactly $k$ implicit responses in the time interval $(0, t]$ is

$$p_k(t) = P(Z_t = k) = \frac{e^{-(\alpha + \beta)t}[(\alpha + \beta)t]^k}{k!} \qquad k = 0, 1, 2, \ldots$$

*Audley, R. J., "A Stochastic Model for Individual Choice Behavior," Psychological Review 67, 1-15 (1960).


problems 


hems 

1- The demand for a certain inventory item js Poison jdh^ an of 
daily demand of one unit What is the prob y 

items requested over a five-day period will be 

(a) exactly 21 (b) less than 2’ (c) at least ^ 

2- If voters arrive at the polls in a Poisson oHwoToMverlappmg 

per minute, what is the probability that in 

5-minute intervals the number of arrivals wi e i 9 

(a) exactly 21 (b) 2 or more 7 (c) 4 or ess ^ 

3. If highway accidents are Poisson distributed at a ta 

road per month, find the probability that in one m hundred- 

la) at least one accident occurs in each of four nonover 
mile stretches 


mile stretches hes 

(b) at least one accident occurs m at least one of th 

Use the procedure of Example 2 to fit a 
lowing accident data reported by Mintz an 

„ . 6 7 8 9 10 211 TW l 

0 1 2 3 4 5 6 — 2 ----- 



- ,nen 201 21 2 1 0 0 0 0 0 Concc p!.” 

• a nd Blum, M L , “A Re-examination of the Accidin 
°J spited Psychology 33, 195-211 (1949) 




to the occurrence of two successive implicit responses of the same kind. For example,

$V = 0$ if $r_1r_1$ or $r_2r_2$

$V = 1$ if $r_2r_1r_1$ or $r_1r_2r_2$,

(a) Prove that $E(V) = 3\alpha\beta/[(\alpha + \beta)^2 - \alpha\beta]$ and hence that $E(V)$ depends only on the ratio $\beta/\alpha$.

(b) Find the value of $P(R_1)$ for which $E(V)$ is a maximum.

9. Show by induction that the reproductive property of the Poisson distri- 
bution extends to any number of random variables That is, P™ )ve a 
if Xu X 2 , , X n are independent Poisson random variables with 

expected values Xi, X 2 > » X„, then Z = X\ + X 2 + 

Poisson random variable with expectation Xi + X 2 + -r • * 

10 (a) Suppose that manufacturing errors form a Poisson process in which 
X, the average failure rate per hour, is unknown Moreover X vanes 
from time to time, and from past experience it is est *^ a e 
P(X = d = 1 an d p ( x = *) = PCX = 2) = * Given that we ob- 
serve two errors in a 4-hour period, find the conditiona pro 
thatX = X = l.andX = 2 

(b) A Bayesian decision rule for estimating X is one ^ 

observed number of errors, chooses that value of X which has the 
largest conditional probability Formulate a Bayesian 
for the estimation of X 

11. Consider a Poisson arrival process, such as voters ” !!° tcs ^ 

place, which has parameter X Let us suppose that eac a ‘ ‘ d 

Bernoulli trial m which success occurs if the arrival 

failure occurs if it is not Prove that the occurrences of ^ordcd^rnvm 

constitute a Poisson process with parameter m P » hich an 

alternative versions of this problem The number o . d 

-sect lays may be a Poisson random variable with paramaer X and 
P is the probability that an individual egg pro uces * . q 

The number of mature insects is then Poisson with ^ 

the number of particles hitting a Geiger co ‘‘" lcr ^ %, c nu mbcr of 
constant probability that a particle activates the count 
recorded particles is Poisson, and so forth ) 

Suppose that a new process is created from the oneina p°™l * 
only alternate arrivals Show that thts process is no 

An inspector monitors the automatic production 1 of P ^c Cur as 
looking for tolerance deviations and casting irregu Iinc j |ue 

independent Poisson processes at the respective rat probability 
per hour Suppose that tolerance devi itions are c t 
bfcand casting irregularities with probability 0 
(a) W'haj t he probability th it in 2 hours 
(0 no tolerance or casting defects occur 
( 11 ) evict!) one defect (of either kind) occurs 
Cm) at most three defects occur ^ 


13 





5. The Homemade Cookie Company has noted that the number of chips in one of their Chocolate Chip Delights is a Poisson random variable and that the average number of chips per cookie is 2.5.

(a) What is the probability that a cookie contains at least one chip?

(b) What is the variance of the number of chips per cookie?

(c) Determine the probability that a box containing $M$ cookies contains exactly the same number of chips as a separate box of $N$ cookies. ($M$ and $N$ are positive integers.)

(d) Due to customer complaints, the inspectors have been instructed to dispose of all cookies containing no chips. What is the expected value and variance of the number of chips per cookie for the remaining cookies?


6. (a) In a Poisson process show that for $s < t$,

$$P(X_s = k \mid X_t = n) = \binom{n}{k}\left(\frac{s}{t}\right)^k\left(1 - \frac{s}{t}\right)^{n-k}$$

Given $n$ occurrences in time $t$, the conditional distribution of occurrences in time $s < t$ is binomial.

(b) Argue that the result in (a) is equivalent to the following. Let $X$ and $Y$ be independent Poisson random variables with expected values $\lambda$ and $\mu$, respectively, and let $Z = X + Y$. Then

$$P(X = k \mid Z = n) = \binom{n}{k}\left(\frac{\lambda}{\lambda + \mu}\right)^k\left(\frac{\mu}{\lambda + \mu}\right)^{n-k}$$

(c) If $X_t$ and $Y_t$ denote the respective numbers of occurrences in two independent Poisson processes and if $E(X_t) = \lambda t$ and $E(Y_t) = \mu t$, then $Z_t = X_t + Y_t$ is a Poisson random variable with $E(Z_t) = (\lambda + \mu)t$. The conditional probability function of $X_t$, given $Z_t = n$, is

$$P(X_t = k \mid Z_t = n) = \binom{n}{k}\left(\frac{\lambda}{\lambda + \mu}\right)^k\left(\frac{\mu}{\lambda + \mu}\right)^{n-k}$$

7. (a) Continuing Example 3, prove that the successive occurrences of implicit responses $r_1$ and $r_2$ form Bernoulli trials with $p = P(r_1) = \alpha/(\alpha + \beta)$ and $q = 1 - p = P(r_2) = \beta/(\alpha + \beta)$. (Hint: Apply Problem 6(c).)

(b) Audley's model* assumes that an overt response of type $R_1$ occurs only after two successive occurrences of the implicit response $r_1$. Examples of such sequences are $r_1r_1$, $r_2r_1r_1$, $r_1r_2r_1r_1$, ..., with respective probabilities $p^2$, $qp^2$, $qp^3$, ..., where $p$ and $q$ are defined in (a). Show that the probability $P(R_1)$ that an overt response is $R_1$ is given by

$$P(R_1) = \frac{\alpha^2(\alpha + 2\beta)}{(\alpha + \beta)[(\alpha + \beta)^2 - \alpha\beta]}$$

8. In Problem 7, let $V$ be the number of occurrences of implicit responses of either type prior to the occurrence of an overt response, that is, prior


*Audley, R. J., "A Stochastic Model for Individual Choice Behavior," Psychological Review 67, 1-15 (1960).







(b) What is the probability that in 2 hours the inspector detects 

( l ) no tolerance or casting defects 9 
(u) exactly one defect 9 

( 111 ) at most, three defects 9 

(c) Suppose that the inspector wishes to take a 15 minute break What is 
the probability that 

(i) no more than two defects will occur 9 

(») no more than two defects would have been missed had he been 
present 9 

(d) Given that six defects have occurred in a half-hour period, what is 
the probability that 

(t) at least one casting irregularity occurred 9 
(n) exactly one casting irregularity occurred 9 

(m) at most two casting irregularities occurred 9 

(e) Let Z be the number of tolerance errors in a half-hour period Find 
£(Z) and Var (Z) 

(f) What is the probability that at least 60% of the casting irregularities 
are detected, given that six casting irregularities occur in a half-hour 
period 9 

(g) Given that six defects occur during an hour interval, what is the 
probability that exactly three of these defects occur in the first 
half-hour period and two of the three are detected 9 


21.2 CONTINUOUS TIME RANDOM PROCESSES 

In the Poisson process, the parameter $\lambda$ represents the expected number of occurrences in a unit interval of time of the phenomenon under study. As such, $\lambda$ provides a measure of the rate of increase of this number of occurrences. This rate remains constant and is unaffected by the number of occurrences or the times of their occurrence.

In the study of population growth, such assumptions are often not realistic, the rate of increase naturally depending upon population size. Moreover, populations may decrease as well as increase. We are thus led to generalize the approach taken in the Poisson process in the manner indicated in Figure 2. When population size is $n$, there are opposing tendencies for subsequent increase or decrease of the size of the population. The rates $\lambda_n$, of increase, and $\mu_n$, of decrease, may both depend on $n$. The Poisson process is obtained by choosing $\lambda_n = \lambda$ and $\mu_n = 0$ for all $n$.

A useful special case (Figure 3) is obtained by choosing $\lambda_n = n\lambda + \alpha$ and $\mu_n = n\mu + \beta$, where $\alpha$, $\beta$, $\lambda$, and $\mu$ are constants. The rationale behind the choices is as follows. Consider a population subject to rates $\alpha$, of immigration, and $\beta$, of emigration, which do not depend on population size. In addition, for each individual, there is a fixed birth rate $\lambda$ and death rate $\mu$. Taken together, these assumptions imply that when population size is $n$, the





probability $1 - \mu h + o(h)$). Expanding the binomial expression and noting that terms involving $h^2$, $h^3$, ... are such that we obtain the limit zero when we divide by $h$ and let $h \to 0$ (that is, these terms are all $o(h)$), we find that the only term which is not $o(h)$ is $n\mu h$, as indicated.

Our process is thus described by taking $\lambda_n = 0$ and $\mu_n = n\mu$ in (4) and (5). Thus, we have

$$p_n'(t) = -n\mu\,p_n(t) + (n + 1)\mu\,p_{n+1}(t) \qquad n \ge 1$$

$$p_0'(t) = \mu\,p_1(t)$$

These equations may be solved successively. If when $t = 0$ the size of the population is $k$ (so that $p_n(t) = 0$ for $n > k$), we have

$$p_k'(t) = -k\mu\,p_k(t)$$

or

$$(D + k\mu)\,p_k(t) = 0$$

Multiplying both sides by $e^{k\mu t}$ gives (see Theorem 3, Section )

$$D[e^{k\mu t} p_k(t)] = 0$$

from which

$$p_k(t) = c_k e^{-k\mu t}$$

where the constant $c_k = 1$ since $p_k(0) = 1$.
With $p_k(t)$ known we next solve for $p_{k-1}(t)$ by writing

$$p_{k-1}'(t) = -(k - 1)\mu\,p_{k-1}(t) + k\mu\,p_k(t)$$

or

$$[D + (k - 1)\mu]\,p_{k-1}(t) = k\mu\,e^{-k\mu t}$$

Multiplying by the integrating factor $e^{(k-1)\mu t}$ yields (see Section 15.2)

$$D[e^{(k-1)\mu t} p_{k-1}(t)] = k\mu\,e^{-\mu t}$$

which may be solved to give

$$p_{k-1}(t) = k\,e^{-(k-1)\mu t}(1 - e^{-\mu t})$$

Proceeding by induction, we find for $0 \le n \le k$

$$p_n(t) = \binom{k}{n} e^{-n\mu t}(1 - e^{-\mu t})^{k-n}$$

a binomial probability function with $p = e^{-\mu t}$. Note that as $t$ increases, $p_n(t)$ goes to zero for $n \ge 1$, while $p_0(t) \to 1$. As one would expect, the population eventually dies out.



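A typical example of such an extinction process is provided by an experiment in which mice were conditioned to exit, at the sound of a tone, from a shock box with variable exit gates.* The parameter $\mu$ was estimated from the data and the model found to provide a good approximation to actual behavior.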

^tterjty. w„ “ A Stochastic Model of Escape llehaxmr. • ’ ’ 

Unitmny, t%9 (unpublished). 

„ ...... ' 





so that the event $(X_{t+h} = n)$ partitions into

$$(X_{t+h} = n) = \bigcup_{k=0}^{\infty} [X_t = k,\ X_{t+h} - X_t = n - k]$$

Again writing $p_n(t)$ for the probability $P(X_t = n)$, we have

$$p_n(t + h) = \sum_{k=0}^{\infty} p_k(t)\,P(X_{t+h} - X_t = n - k \mid X_t = k)$$

Using assumptions (a)-(d) above, this reduces to

$$p_n(t + h) = p_n(t)[1 - \lambda_n h - \mu_n h] + p_{n-1}(t)\lambda_{n-1}h + p_{n+1}(t)\mu_{n+1}h + o(h)$$

Subtracting $p_n(t)$, dividing by $h$, and letting $h \to 0$ yields the differential equation

$$p_n'(t) = -(\lambda_n + \mu_n)\,p_n(t) + \lambda_{n-1}\,p_{n-1}(t) + \mu_{n+1}\,p_{n+1}(t) \qquad (4)$$

for $n \ge 1$. When $n = 0$, we write $\mu_0 = 0$ and $p_{-1}(t) = 0$ to reflect the fact that population size is never negative, obtaining

$$p_0'(t) = -\lambda_0\,p_0(t) + \mu_1\,p_1(t) \qquad (5)$$

Equations (4) and (5) are the general differential equations for birth and death processes. As an intuitive aid to understanding these equations, we may view $\lambda_n$ and $\mu_n$ as measures of the tendency of the process to leave state (population size) $n$, moving either upward or downward. The rate at which these transitions occur is proportional to the probability $p_n(t)$ of being in state $n$. Similarly, we enter state $n$, or tend to increase the probability of being in this state, either by being in state $n - 1$ (probability $p_{n-1}(t)$) and moving up (rate $\lambda_{n-1}$) or by being in state $n + 1$ (probability $p_{n+1}(t)$) and moving down (rate $\mu_{n+1}$).
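
Equations (4) and (5) lend themselves to direct numerical integration. The following is a rough sketch under stated assumptions, not a method given in the text: the state space is cut off at a largest state in which births are suppressed, plain Euler steps are used, and the function and variable names are invented for illustration. As a check it is run on pure death rates $\lambda_n = 0$, $\mu_n = n\mu$, for which the solution is the binomial probability function with $p = e^{-\mu t}$.

```python
import math

def integrate_birth_death(birth, death, p0, t_end, steps=20000):
    """Euler steps for equations (4) and (5) on states 0 .. len(p0) - 1.

    birth(n) and death(n) return the rates lambda_n and mu_n. The top
    state is given no births (an assumption of this sketch) so that total
    probability is conserved on the truncated state space.
    """
    top = len(p0) - 1
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        new = []
        for n in range(top + 1):
            lam_n = birth(n) if n < top else 0.0
            outflow = (lam_n + death(n)) * p[n]
            inflow = 0.0
            if n > 0:
                inflow += birth(n - 1) * p[n - 1]   # moving up from n - 1
            if n < top:
                inflow += death(n + 1) * p[n + 1]   # moving down from n + 1
            new.append(p[n] + h * (inflow - outflow))
        p = new
    return p

# Pure death check with assumed values: k = 5 members, mu = 0.4, t = 2.
k, mu, t = 5, 0.4, 2.0
start = [0.0] * k + [1.0]
numeric = integrate_birth_death(lambda n: 0.0, lambda n: n * mu, start, t)
q = math.exp(-mu * t)
exact = [math.comb(k, n) * q ** n * (1 - q) ** (k - n) for n in range(k + 1)]
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
print(max_error)
```

Other rate choices, such as $\lambda_n = \lambda$ and $\mu_n = 0$ for the Poisson process, can be passed in the same way, provided the cut-off state is taken large enough that it carries negligible probability.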

Example 1 A "Pure Death" or "Extinction" Process. Imagine a population whose members can only die (leave the population). Let us assume that members act independently and that during any short interval of time $h$ each member has probability $\mu h + o(h)$ of dying. Then, of $n$ members, the probability that none die in time $h$ is

$$[1 - \mu h + o(h)]^n = 1 - n\mu h + o(h)$$

the probability that exactly one dies is

$$n[1 - \mu h + o(h)]^{n-1}[\mu h + o(h)] = n\mu h + o(h)$$

and the probability that two or more die is

$$1 - [1 - n\mu h + o(h)] - [n\mu h + o(h)] = o(h)$$

To illustrate the calculations, we obtain the second of these equations by noting that the $n$ members act independently and that any one of them may die (this has probability $\mu h + o(h)$), while the others must live (each with




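Solving recursively we find

$$a_1 = \frac{\lambda}{\mu}\,a_0 \qquad a_2 = \frac{(\lambda + \mu)a_1 - \lambda a_0}{\mu} = \left(\frac{\lambda}{\mu}\right)^2 a_0$$

and, by induction,

$$a_n = \left(\frac{\lambda}{\mu}\right)^n a_0$$

for all $n \ge 0$.

The series $\sum_{n=0}^{\infty} a_n = a_0 \sum_{n=0}^{\infty} (\lambda/\mu)^n$ converges only when $\lambda < \mu$. In this case, since $\sum a_n = 1$, we have

$$a_0 = 1 - \frac{\lambda}{\mu}$$

and the limiting distribution is geometric with

$$a_n = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n \qquad n \ge 0$$

If $\lambda \ge \mu$, we must take $a_0 = 0$ since otherwise $\sum a_n$ diverges. In this case, we have

$$a_n = 0 \quad \text{for all } n$$

and the waiting line must exceed all bounds as $t \to \infty$.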
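
The geometric limit can be verified by substituting it back into the equations for the limiting values. The sketch below is illustrative only; the function name and the rates $\lambda = 1$, $\mu = 2$ are assumptions chosen for the check.

```python
def queue_limits(lam, mu, n_max):
    """Limiting probabilities a_0 .. a_n_max for the single-server queue."""
    if lam >= mu:
        return [0.0] * (n_max + 1)   # no proper limiting distribution
    ratio = lam / mu
    return [(1 - ratio) * ratio ** n for n in range(n_max + 1)]

lam, mu = 1.0, 2.0  # illustrative arrival and departure rates (assumed)
a = queue_limits(lam, mu, 60)

# Residuals of 0 = -(lam + mu) a_n + lam a_(n-1) + mu a_(n+1), n >= 1,
# and of 0 = -lam a_0 + mu a_1.
residuals = [-lam * a[0] + mu * a[1]]
residuals += [
    -(lam + mu) * a[n] + lam * a[n - 1] + mu * a[n + 1]
    for n in range(1, 60)
]
print(max(abs(r) for r in residuals), sum(a))
```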

Example 4* Consider the process by which a group of $N$ people change their preferences for two opposing political candidates. Assume that if at time $t$ a person favors candidate A, the probability that in the interval $(t, t + h)$ he switches support to candidate B is $\mu h + o(h)$, while if he favors B, the probability of switching is $\lambda h + o(h)$. The group is said to be in state $E_n$ if $n$ members support candidate A.

Assuming group members act independently, we have a birth and death process with

$$\lambda_n = (N - n)\lambda \quad \text{and} \quad \mu_n = n\mu \qquad \text{for } 0 \le n \le N$$

The basic system of differential equations is

$$p_0'(t) = -N\lambda\,p_0(t) + \mu\,p_1(t)$$

$$p_n'(t) = -[(N - n)\lambda + n\mu]\,p_n(t) + (N - n + 1)\lambda\,p_{n-1}(t) + (n + 1)\mu\,p_{n+1}(t) \qquad 1 \le n \le N - 1$$

$$p_N'(t) = -N\mu\,p_N(t) + \lambda\,p_{N-1}(t)$$

hon Coleman, J S . Iirmtxwn « time slc d 

m. 1 %!) „ ,Vm A of com. nuo 

' 11 1 t\x.l 





Example 2 A "Pure Birth" Process. In contrast to the death process of the preceding example, consider a population whose members create new members but do not die. If we assume that members act independently, each at rate $\lambda$, the process is described by (4) and (5) with $\lambda_n = n\lambda$ and $\mu_n = 0$. It is easily verified that if $n_0$ is the population size at time $t = 0$, then for $n \ge n_0$,

$$p_n(t) = \binom{n - 1}{n - n_0} e^{-\lambda t n_0}(1 - e^{-\lambda t})^{n - n_0}$$

which for each $t$ is a negative binomial probability function with $p = e^{-\lambda t}$. ►

Although we shall not attempt a proof here, it can be shown that in any birth and death process, the limits

$$\lim_{t \to \infty} p_n(t) = a_n$$

exist and do not depend on the initial conditions under which the process began. The $a_n$ may be determined from the explicit form of the solution (as in Example 1), but are more often found by solving the equations obtained from (4) and (5) by replacing the derivatives on the left by zero.

Example 3 Consider a system which processes randomly arriving inputs on a first-come, first-served basis. If the processing unit is busy, the new inputs form a waiting line or queue. Typical examples of queueing systems are the flow of aircraft in and out of an airport, the flow of cars at a toll booth, arrivals of injured at the emergency room of a hospital, or the arrival of customers at a restaurant or supermarket.
In the simplest case the system contains a single service unit. Inputs are processed individually and remain in the waiting line until processed. Let us assume that inputs arrive according to a Poisson process with rate $\lambda$ and that individuals depart at rate $\mu$. The system is thus a birth and death process with $\lambda_n = \lambda$ for all $n$, while $\mu_n = \mu$ for $n \ge 1$ and $\mu_0 = 0$.

Although explicit solutions may be obtained,* these are rather complicated and will not be considered here. On the other hand, limiting values are obtained fairly easily. Replacing the derivatives by zero in (4) and (5), we obtain

$$0 = -(\lambda + \mu)a_n + \lambda a_{n-1} + \mu a_{n+1}$$

for $n \ge 1$, while for $n = 0$,

$$0 = -\lambda a_0 + \mu a_1$$


*See Cox, D. R., and Smith, W. L., Queues (John Wiley & Sons, Inc., New York, 1961), pp. 60-64.




4. Consider a "pure birth" process and let $X_t$ denote the number of "births" by time $t$. Assume that when $X_t$ is odd the probability of a birth in the time interval $(t, t + h)$ is $\lambda_1 h + o(h)$, while if $X_t$ is even, the probability is $\lambda_2 h + o(h)$. Take $X_0 = 0$. Find the probabilities

$$p_1(t) = P(X_t \text{ is odd}) \quad \text{and} \quad p_2(t) = P(X_t \text{ is even})$$

(Hint: Derive the differential equations

$$p_1'(t) = -\lambda_1 p_1(t) + \lambda_2 p_2(t) \quad \text{and} \quad p_2'(t) = \lambda_1 p_1(t) - \lambda_2 p_2(t)$$

and solve them.)

5. The follow, ng is a “pure btrth" process w.th .“^SsumedtoTrecorf 
an application of stochastic models to socl ’ .. contagl0 n” process 

purchases of teen-age girls could be represen . | ll>ely she wou ld 

in that the more records a girl had purchased, assumed that 

be, in a given interval of time, ° f 

if by time 1 a girl had purchased n rec ^ parameter /3 is 

a new purchase in (t, / + h) was (a + nfi)h + 

called a contagion parameter an 

(a) Write the differential equations which 8 °^ rn ^ C p^ess starts in 
induct, ve argument to verify that, assuming the proce 

state Eo, the solution is 

O+sOllr/i _ n"ot« + 0 ] 

Pn«) = £ 


nt/3" 

(b) Prove that J^n^oPnO) - 1 ^ or al1 1 f r ri x )~ r ) 

(Hint Make use of the senes expansion for (1 

.. * - n and the system starts in state i 

6. Rework Problem 5 assuming that a ‘ continuous 

7. A social movement gains and loses followers a 0 j- an infinite 

time “contagion” process Specifica then the probibility 

population are followers of the movemen * follower is 

that m the interval (/, t + h) the movement gams 

(X + na)h + °( tl ) 

Followers of the movement act 
where a is a “contagion” parameter j , r onc drops out is 

independently and the probability 1 a a limiting probability «■ •k' 1 * 

n + m (That is, „.-(>») ^ of lhcmoumcnt 

exactly n members or the population ore ^ ^ ^ nU intom order 

8- Parsons and Shilst postulate that social Km stales Eo, Ei* 

in their internal structure Assume a seq disonJcr ,\ssuntc tint > r ' 

C 3 ,. .arranged according to their degree . of moving to 

s>s.em ,s in state E.(n > 1) »> ««« '• ,h ' ] ,< „h -> ..(M.-ndof 
fain (r, r + A) ts X/i -1- <’( *)■ of m0 ' 1 { l f ncuc. Ill . I** 1 ' 

J s , I„, m , rMm to Mathrmated Snadoo (I ree t res 
.r 101 s of A rivnttsnmdUnnen.i 

T, and Sluts. E,. t ds , To a Cftr.it T tot) 

'■ Txri'nJfc. Mass . 1931) . - 


« -" OCt5StS ' " 



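Replacing all derivatives by zero and solving, we obtain

$$a_1 = N\left(\frac{\lambda}{\mu}\right)a_0 \qquad a_2 = \frac{N(N - 1)}{2}\left(\frac{\lambda}{\mu}\right)^2 a_0$$

and by induction

$$a_n = \binom{N}{n}\left(\frac{\lambda}{\mu}\right)^n a_0$$

for $0 \le n \le N$. Of course, $a_n = 0$ for $n > N$. Imposing the restriction that

$$1 = \sum_{n=0}^{N} a_n = a_0 \sum_{n=0}^{N} \binom{N}{n}\left(\frac{\lambda}{\mu}\right)^n = a_0\left(1 + \frac{\lambda}{\mu}\right)^N$$

we find $a_0 = [\mu/(\mu + \lambda)]^N$ and in general

$$a_n = \binom{N}{n}\left(\frac{\lambda}{\lambda + \mu}\right)^n\left(\frac{\mu}{\lambda + \mu}\right)^{N-n} \qquad 0 \le n \le N$$

The limiting probabilities of being in the various states follow a binomial probability function. ►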
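
The binomial form of the limiting probabilities can be confirmed numerically. The sketch below is an assumed illustration: it builds the $a_n$ from the ratio $a_n / a_{n-1} = \lambda_{n-1} / \mu_n$, which is what the successive solutions above amount to, normalizes them, and compares the result with the binomial probability function. The names and the values $N = 8$, $\lambda = 0.3$, $\mu = 0.5$ are not from the text.

```python
import math

def limiting_distribution(N, lam, mu):
    """Limiting probabilities for rates lambda_n = (N - n) lam, mu_n = n mu."""
    weights = [1.0]                       # a_0 before normalization
    for n in range(1, N + 1):
        birth_prev = (N - n + 1) * lam    # lambda_(n-1)
        death_here = n * mu               # mu_n
        weights.append(weights[-1] * birth_prev / death_here)
    total = sum(weights)
    return [w / total for w in weights]

N, lam, mu = 8, 0.3, 0.5  # illustrative values (assumed)
a = limiting_distribution(N, lam, mu)
p = lam / (lam + mu)
binomial = [math.comb(N, n) * p ** n * (1 - p) ** (N - n) for n in range(N + 1)]
max_gap = max(abs(x - y) for x, y in zip(a, binomial))
print(max_gap)
```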


PROBLEMS

1. Consider the acquisition and loss of friends as a continuous time random process in which new friends are acquired in accord with a Poisson process with parameter $\lambda$. Friendship durations are assumed to be mutually independent, the probability of losing a particular friend during a time interval of length $h$ being $\mu h + o(h)$. The system is said to be in state $E_n$ if a person has $n$ friends. Find the limiting probabilities of being in the various states.

2. In Problem 1, given that at time zero a person has no friends, verify that the probability that he has $n$ friends at time $t$ is

$$p_n(t) = \frac{e^{-a}a^n}{n!} \qquad (n = 0, 1, 2, 3, \ldots)$$

where $a = (\lambda/\mu)(1 - e^{-\mu t})$.

3. Let us suppose that successive automobile purchases by a particular family constitute a continuous time process in which the probability that car A is purchased during a time interval of length $h$ is $\lambda h + o(h)$ and the probability that a different make is purchased is $\mu h + o(h)$. The system is in state $E_n$ ($n \ge 1$) if the family has purchased car A $n$ times in succession and in state $E_0$ if the last purchase was of a different make. Find the limiting probabilities of being in the various states.




(a) Argue that the system is a birth and death process with $\lambda_n = \lambda$ and

$$\mu_n = n\mu \quad (n < M) \qquad \mu_n = M\mu \quad (n \ge M)$$


(b) Find the limiting probability that there is at least one element in the 
system 

(c) Find the limiting probability that there are elements in the system 

(d) Under what conditions on $\lambda$ and $\mu$ does a limiting probability function exist?

(e) Find the probability that an input has to wait before being processed.

12. Suppose that inputs in Problem 11 that are not processed immediately leave and do not return. Show that the limiting probability distribution is

$$a_n = \frac{(\lambda/\mu)^n / n!}{\sum_{k=0}^{M} (\lambda/\mu)^k / k!} \qquad (n = 0, 1, 2, \ldots, M)$$


21.3 CONTINUOUS RANDOM VARIABLES

Our discussion thus far has been concerned with discrete random variables. (See Section 19.1.) The graph of the distribution function of such a random variable consists entirely of jumps and level stretches, the probability being concentrated at the jump points. At the other extreme are those random variables whose distribution functions have no jumps and thus are continuous. Such random variables are themselves called continuous.
A particularly important continuous random variable is the exponential, which arises in connection with the Poisson process. Consider the time $T$ that one must wait for the first occurrence. It is obvious that $T$ is greater than the real number $t$ if and only if there are no occurrences by time $t$. Thus, using the notation of Section 21.1, $(T > t)$ and $(X_t = 0)$ are the same event, and we have

$$P(T > t) = P(X_t = 0) = e^{-\lambda t}$$

For $t \ge 0$, then, the distribution function of $T$ is the continuous function

$$F_T(t) = P(T \le t) = 1 - e^{-\lambda t} \qquad (6)$$

Of course, since $T$ cannot be negative, $F_T(t) = 0$ when $t < 0$. The quantity $T$ is called an exponential random variable because of the form of its distribution function (6).

Example 1 Voters arrive at the polls in a Poisson manner at the rate of two per minute. If $T$ denotes the waiting time in minutes until the first arrival, then for $t \ge 0$, the distribution function is

$$F_T(t) = P(T \le t) = 1 - e^{-2t}$$




moving to any other state is $o(h)$. For $n = 0$, these probabilities apply, respectively, to remaining in state $E_0$ or moving to $E_1$. Prove that if $\mu > \lambda$, the social system in the long run has greater probability of being in state $E_0$, the state of least disorder, than in any other state.

9. In Example 4, assume that group members are influenced by the number of persons who support the opposing candidate. Specifically, assume that if at time $t$ a person supports candidate A and there are $N - n$ persons who support B, then the probability that in the interval $(t, t + h)$ he switches his support to B is $[\mu + (N - n)\alpha]h + o(h)$. Under the same circumstances, the probability that a person favoring B switches to A is $[\lambda + n\alpha]h + o(h)$. Group members act independently.

(a) Write the differential equations governing the process.

(b) Show that the limiting probabilities satisfy

$$a_1 = \frac{N\lambda}{\mu + (N - 1)\alpha}\,a_0$$

$$a_2 = \frac{N(N - 1)\lambda(\lambda + \alpha)}{2[\mu + (N - 1)\alpha][\mu + (N - 2)\alpha]}\,a_0$$

and, by induction, that

$$a_n = \binom{N}{n}\frac{\lambda(\lambda + \alpha)(\lambda + 2\alpha) \cdots (\lambda + (n - 1)\alpha)}{[\mu + (N - 1)\alpha][\mu + (N - 2)\alpha] \cdots [\mu + (N - n)\alpha]}\,a_0$$

(c) Using $\sum_{n=0}^{N} a_n = 1$, show that

$$a_0 = \frac{\prod_{j=0}^{N-1} (\mu + j\alpha)}{\prod_{j=0}^{N-1} (\lambda + \mu + j\alpha)}$$

and thus verify that the limiting probability $a_n$ of finding the system in state $E_n$ is the same (if $\lambda$, $\mu$, and $\alpha$ are integers) as the probability of drawing $n$ black balls in $N$ draws from a Polya urn which initially contains $\lambda$ blacks and $\mu$ reds and to which $\alpha$ balls are added following each draw.

10. The operator of a car washing establishment finds that customers arrive at an average rate of one every 5 minutes, and that service takes, on the average, 3 minutes. Assume that arrivals follow a Poisson process and that services are completed at rate $\frac{1}{3}$.

(a) What is the probability that a customer will not have to wait? (Hint: Proceed as in Example 3 and compute this as the probability the waiting line is empty.)

(b) Given there is a waiting line, what is the expected waiting time?

11. Assume that the system of Example 3 has $M$ identical processors. If all processors are busy, an incoming element joins a common waiting line and waits until a processor is free.


668 / CONTINUOUS PROCESSES AND VARIABLES 



FIGURE 5 (a) Density function of a uniform variable U. 
(b) Distribution function of U 



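Example 3 A random variable X having the density function

$$f_X(x) = \begin{cases} \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$

is called a beta variable. The $\Gamma$-function is defined as

$$\Gamma(\alpha + 1) = \int_0^\infty e^{-t} t^{\alpha}\, dt$$

For integer values of $\alpha$, $\Gamma(\alpha + 1) = \alpha!$. The density has two parameters $\alpha$ and $\beta$, which must both be greater than zero. When $\alpha = \beta = 1$, the beta variable becomes the uniform variable. The distribution function, often called the incomplete beta function, is

$$F_X(x) = \begin{cases} 0 & x < 0 \\ \displaystyle\int_0^x \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, t^{\alpha-1}(1-t)^{\beta-1}\, dt & 0 \le x \le 1 \\ 1 & x > 1 \end{cases}$$

and has been extensively tabulated.* ►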

For any random variable X, if b > a then $P(X \le b) \ge P(X \le a)$. In terms of the distribution function of X, $F_X(b) \ge F_X(a)$ whenever b > a, so that $F_X$ is an increasing function. It follows that the density function $f_X = F_X'$ must be non-negative. Moreover, the relation

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt$$

yields

$$(1) \qquad \int_{-\infty}^{\infty} f_X(x)\, dx = 1$$

*Tables of the Incomplete Beta Function, Pearson, K., Ed. (Cambridge University Press, Cambridge, England, 1934).


The probability that at least 3 minutes elapse before an arrival is $1 - F_T(3) = e^{-6} \approx 0.0025$, while the probability that no more than 15 seconds are required is $F_T(\tfrac{1}{4}) = 1 - e^{-1/2} \approx 0.393$. ►
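The two probabilities above can be checked numerically. The following is a minimal sketch, assuming an arrival rate of 2 per minute (the value implied by the exponents $e^{-6}$ at t = 3 and $e^{-1/2}$ at t = 1/4); the function name `exponential_cdf` is illustrative and does not come from the text.

```python
import math

def exponential_cdf(t, rate):
    """Distribution function F_T(t) = 1 - exp(-rate * t) for t > 0, and 0 otherwise."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0

RATE = 2.0  # assumed arrivals per minute, implied by e^{-6} at t = 3

# P(T > 3): at least 3 minutes elapse before the first arrival
p_wait_at_least_3 = 1.0 - exponential_cdf(3.0, RATE)

# P(T <= 1/4): the first arrival occurs within 15 seconds
p_within_15_sec = exponential_cdf(0.25, RATE)

print(round(p_wait_at_least_3, 4), round(p_within_15_sec, 3))
```

Changing `RATE` shows how strongly both probabilities depend on the arrival rate of the underlying Poisson process.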


The derivative of the distribution function of a random variable X is called its density function.* For the exponential case (6) the density function is

$$f_T(t) = F_T'(t) = \begin{cases} 0 & \text{when } t < 0 \\ \lambda e^{-\lambda t} & \text{when } t > 0 \end{cases}$$

(See Figure 4.) The distribution function $F_T$ does not have a derivative at t = 0, but if we extend the definition of $f_T$ by writing $f_T(0) = 0$, then it follows that for all t

$$F_T(t) = \int_{-\infty}^{t} f_T(x)\, dx$$


FIGURE 4 (a) Distribution function of T. (b) Density function of T

Example 2 A random variable U having the density function shown in Figure 5 is called a uniform or rectangular random variable. Here the density function is

$$f_U(t) = \begin{cases} 0 & \text{if } t < 0 \text{ or } t > 1 \\ 1 & \text{when } 0 \le t \le 1 \end{cases}$$

The corresponding distribution function is

$$F_U(t) = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } 0 \le t \le 1 \\ 1 & \text{for } t > 1 \end{cases}$$

(See Example 4 of Section 19.1.)


*Although we shall not encounter them in this text, it is possible to find distribution functions which have no derivatives. Thus in advanced books it is common to distinguish between a continuous random variable X, one which has a continuous distribution function, and an absolutely continuous random variable Y, one for which there exists a function h such that the distribution function can be written as

$$F_Y(y) = \int_{-\infty}^{y} h(t)\, dt$$

For our purposes the role of h will be played by the derivative of $F_Y$.





The expected value of a continuous random variable X with density function $f_X$ is defined by the improper integral

$$E(X) = \int_{-\infty}^{\infty} t f_X(t)\, dt \qquad (7)$$

and the expectation of a function h of X is defined by

$$E(h(X)) = \int_{-\infty}^{\infty} h(t) f_X(t)\, dt \qquad (8)$$


Example 5 The decision as to what stock inventory level to maintain is often determined by the demand anticipated against the inventory. As a simple example, let us assume that demand is a continuous random variable $D \ge 0$ with density function f. Let I denote the inventory level. Assuming a per unit overage cost $c_1$ when supply exceeds demand and a per unit shortage cost $c_2$ when demand exceeds supply, the cost function C given by

$$C = \begin{cases} c_1(I - D) & \text{when } I \ge D \\ c_2(D - I) & \text{when } D > I \end{cases}$$

is a random variable with expected value

$$E(C) = c_1 \int_0^I (I - t) f(t)\, dt + c_2 \int_I^\infty (t - I) f(t)\, dt$$

$$= c_1 I \int_0^I f(t)\, dt - c_1 \int_0^I t f(t)\, dt + c_2 \int_I^\infty t f(t)\, dt - c_2 I \int_I^\infty f(t)\, dt$$

E(C) is a function of I, and we may minimize expected cost by determining the value $I_0$ of I for which

$$\frac{d}{dI} E(C) = 0$$

Recalling that

$$\frac{d}{dx} \int_a^x f(t)\, dt = f(x)$$

and writing

$$F(I) = \int_0^I f(t)\, dt$$

for the distribution function of D, we find

$$\frac{d}{dI} E(C) = c_1 I f(I) + c_1 F(I) - c_1 I f(I) - c_2 I f(I) - c_2[1 - F(I)] + c_2 I f(I)$$

$$= (c_1 + c_2) F(I) - c_2$$


$$(2) \qquad \int_a^b f_X(x)\, dx = \int_{-\infty}^{b} f_X(x)\, dx - \int_{-\infty}^{a} f_X(x)\, dx = F_X(b) - F_X(a) = P(a < X \le b) \quad \text{for } a < b$$

Thus we see that the total area is unity under the curve representing $f_X$, and that the area under the curve between the values a and b represents the probability that X takes on a value in the interval (a, b]. (See Figure 6.)

FIGURE 6 The area under the density curve between a and b equals $P(a < X \le b)$


Example 4 Suppose the weekly garbage pickup G (in tons) for a certain city has the density function

$$f(x) = \begin{cases} k[-x^2 + 110x - 1000] & 10 < x < 100 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant such that

$$\int_{-\infty}^{\infty} f(x)\, dx = 1$$

We compute k by writing

$$\int_{-\infty}^{\infty} f(x)\, dx = \int_{-\infty}^{10} 0\, dx + \int_{10}^{100} k[-x^2 + 110x - 1000]\, dx + \int_{100}^{\infty} 0\, dx$$

$$1 = 0 + k(121{,}500) + 0$$

Hence,

$$k = 1/121{,}500$$

and $f(x) = (1/121{,}500)(-x^2 + 110x - 1000)$ for 10 < x < 100.

If the city can handle 40 tons without overtime, then the probability that normal operations will suffice is

$$P(10 < G \le 40) = \frac{1}{121{,}500} \int_{10}^{40} (-x^2 + 110x - 1000)\, dx = 7/27$$ ►
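The normalizing constant and the probability in Example 4 can be reproduced with exact rational arithmetic. This is a small sketch rather than part of the original example; the helper names `antiderivative` and `integral` are illustrative.

```python
from fractions import Fraction

def antiderivative(x):
    """Antiderivative of -x^2 + 110x - 1000, evaluated exactly at x."""
    x = Fraction(x)
    return -x**3 / 3 + 55 * x**2 - 1000 * x

def integral(lo, hi):
    """Definite integral of -x^2 + 110x - 1000 from lo to hi."""
    return antiderivative(hi) - antiderivative(lo)

total = integral(10, 100)             # equals 1/k, since k * total must be 1
k = 1 / total
p_no_overtime = k * integral(10, 40)  # P(10 < G <= 40)

print(total, k, p_no_overtime)
```

Using `Fraction` avoids rounding, so the result can be compared directly with the value 7/27 stated in the example.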




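Example 6 The moment generating function of an exponential random variable T is

$$m_T(s) = E(e^{sT}) = \int_{-\infty}^{\infty} e^{st} f_T(t)\, dt = \int_0^\infty \lambda e^{-\lambda t} e^{st}\, dt = \left[ \frac{\lambda}{s - \lambda}\, e^{-(\lambda - s)t} \right]_0^\infty = \frac{\lambda}{\lambda - s} \quad \text{for } s < \lambda$$

Differentiating, we find

$$E(T) = m_T'(0) = \lambda(\lambda - s)^{-2} \Big|_{s=0} = \frac{1}{\lambda}$$

$$E(T^2) = m_T''(0) = 2\lambda(\lambda - s)^{-3} \Big|_{s=0} = \frac{2}{\lambda^2}$$

Hence

$$\mathrm{Var}(T) = E(T^2) - [E(T)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$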
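The moments obtained from the generating function can be checked numerically by differencing $m_T(s) = \lambda/(\lambda - s)$ near s = 0. The sketch below assumes an illustrative rate $\lambda = 2$; the step size `h` and the function names are choices made here, not part of the text.

```python
def mgf_exponential(s, lam):
    """Moment generating function lam / (lam - s), valid only for s < lam."""
    if s >= lam:
        raise ValueError("the generating function is defined only for s < lam")
    return lam / (lam - s)

def moments_from_mgf(mgf, h=1e-4):
    """Approximate m'(0) and m''(0) by central finite differences."""
    m_plus, m_zero, m_minus = mgf(h), mgf(0.0), mgf(-h)
    first = (m_plus - m_minus) / (2 * h)
    second = (m_plus - 2 * m_zero + m_minus) / h**2
    return first, second

lam = 2.0  # illustrative rate, chosen for this sketch
first_moment, second_moment = moments_from_mgf(lambda s: mgf_exponential(s, lam))
variance = second_moment - first_moment**2

print(first_moment, second_moment, variance)
```

For $\lambda = 2$ the three printed values should lie close to $1/\lambda = 0.5$, $2/\lambda^2 = 0.5$, and $1/\lambda^2 = 0.25$.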


The value $E(T) = 1/\lambda$ might have been anticipated intuitively as follows. The quantity $\lambda$ represents the average number of occurrences per unit time in the underlying Poisson process. If, on the average, there are $\lambda$ occurrences in one unit of time, then it is reasonable to expect that the average length of time between occurrences is $1/\lambda$. ►

*In working with discrete random variables the relation (8) was derived from (7) as a theorem. We have used the definition here to simplify details, essentially accepting without proof the following theorem.

Theorem 1 If X is a continuous random variable with density $f_X$ and Y = h(X) is also continuous with density $f_Y$, then E(Y) is given by either

$$E(Y) = \int_{-\infty}^{\infty} t f_Y(t)\, dt$$

or by

$$E(Y) = \int_{-\infty}^{\infty} h(t) f_X(t)\, dt$$

While Theorem 1 serves in most cases to eliminate the need for finding the density $f_Y$ when we already have $f_X$, there are times when $f_Y$ must be found. The procedure is best illustrated through examples.



Thus the optimum inventory level is the value $I_0$ for which

$$F(I_0) = \frac{c_2}{c_1 + c_2}$$

or, equivalently, for which

$$\frac{F(I_0)}{1 - F(I_0)} = \frac{c_2}{c_1}$$

In order to minimize expected cost the inventory level should be established at the point where the ratio of the probability of an overage to the probability of a shortage equals the ratio of shortage cost to overage cost. ►
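The optimality condition $F(I_0) = c_2/(c_1 + c_2)$ can be solved in closed form once a demand distribution is chosen. The following sketch assumes, purely for illustration, exponential demand with mean 10 and costs $c_1 = 1$, $c_2 = 4$; none of these numbers come from the text.

```python
import math

def optimal_inventory_exponential(c1, c2, mean_demand):
    """Solve F(I0) = c2 / (c1 + c2) when F(x) = 1 - exp(-x / mean_demand)."""
    critical_ratio = c2 / (c1 + c2)
    return -mean_demand * math.log(1.0 - critical_ratio)

# Hypothetical inputs chosen for this sketch
c1, c2, mean_demand = 1.0, 4.0, 10.0
level = optimal_inventory_exponential(c1, c2, mean_demand)

# Check the equivalent odds form: F(I0) / (1 - F(I0)) should equal c2 / c1
F_level = 1.0 - math.exp(-level / mean_demand)
odds = F_level / (1.0 - F_level)

print(round(level, 3), round(odds, 6))
```

A larger shortage cost relative to overage cost pushes the critical ratio toward 1 and therefore raises the recommended stock level.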

The variance of a continuous random variable X is defined in the same way as that of a discrete variable. That is,

$$\mathrm{Var}(X) = E([X - E(X)]^2)$$

The factorial moment generating function is

$$g_X(s) = E(s^X)$$

and the moment generating function is

$$m_X(s) = E(e^{sX})$$

As with discrete random variables, means and variances of continuous random variables can be found by differentiating the generating functions. For instance, using the moment generating function

$$m_X(s) = E(e^{sX}) = \int_{-\infty}^{\infty} e^{st} f_X(t)\, dt$$

the derivative is

$$m_X'(s) = \int_{-\infty}^{\infty} t e^{st} f_X(t)\, dt$$

(Here we assume without proof that the order of the operations of differentiation and integration can be interchanged in a manner similar to the differentiation of series expressed in Theorem 4 of Section 16.2.) Setting s = 0 gives

$$m_X'(0) = \int_{-\infty}^{\infty} t f_X(t)\, dt = E(X)$$

Similar calculations show that for any k, the kth derivative of $m_X$, evaluated at zero, is $E(X^k)$.





occur in the time V between $R_1$ and the next excitation $E_0$. Following $E_0$, the waiting time W to the next response is assumed to be an exponential random variable with parameter $\lambda$.

Primary interest centers on the random variable S denoting the time between a response and the immediately preceding excitation. S takes values only between zero and $\tau$, and the event $(S \le s)$ occurs if W falls between $k\tau$ and $k\tau + s$ for some value of k. Thus for $0 \le s \le \tau$,

$$P(S \le s) = \sum_{k=0}^{\infty} P(k\tau < W \le k\tau + s) = \sum_{k=0}^{\infty} (1 - e^{-\lambda s}) e^{-\lambda \tau k} = \frac{1 - e^{-\lambda s}}{1 - e^{-\lambda \tau}} \qquad (9)$$ ►
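Equation (9) is a geometric series, and the closed form can be compared with a truncated version of the sum. The sketch below uses illustrative values $\lambda = 0.5$, $\tau = 2$, and s = 1.2 chosen here for demonstration; the function names are not from the text.

```python
import math

def cdf_S_series(s, lam, tau, terms=200):
    """Truncated sum over k of P(k*tau < W <= k*tau + s) for exponential W."""
    return sum(
        math.exp(-lam * k * tau) - math.exp(-lam * (k * tau + s))
        for k in range(terms)
    )

def cdf_S_closed(s, lam, tau):
    """Closed form (1 - exp(-lam*s)) / (1 - exp(-lam*tau)) from Equation (9)."""
    return (1.0 - math.exp(-lam * s)) / (1.0 - math.exp(-lam * tau))

lam, tau, s = 0.5, 2.0, 1.2  # illustrative parameter values
series_value = cdf_S_series(s, lam, tau)
closed_value = cdf_S_closed(s, lam, tau)

print(series_value, closed_value)
```

At $s = \tau$ the closed form equals 1, which agrees with the statement that S takes values only between zero and $\tau$.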


PROBLEMS

1. Subjects arrive at an experimental lab at the average rate … in accord with a Poisson process. Suppose that each subject has probability 0.20 of being selected to participate … Let T denote the waiting time for the arrival of a suitable experimental subject. Find the mean and variance of T and P(…

2. Let X be a random variable with density $f_X$ given by

$$f_X(t) = \begin{cases} \ldots & \text{when } 0 < t < 2 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the number m such that X is equally likely to be greater than or less than m. (Note: such a number is called the median value of X.)

(b) Find a number c such that P(X > c) …

3. Let the density function of the random variable T be

$$f_T(t) = \begin{cases} \ldots & t > 0 \\ 0 & t \le 0 \end{cases}$$

Let the events $A_1$, $A_2$, $A_3$, $A_4$ be defined by

$A_1 = (T < 0)$

$A_2 = (T > 0)$

$A_3 = (0 < T < 2)$

$A_4 = (-1 < T < 0)$




Example 7 Suppose X has the density

$$f_X(t) = \begin{cases} 2te^{-t^2} & \text{when } t > 0 \\ 0 & \text{when } t \le 0 \end{cases}$$

and suppose we require the density of $Y = X^2$. Then for z > 0,

$$P(Y \le z) = P(X^2 \le z) = P(X \le z^{1/2})$$

In terms of distribution functions this says

$$F_Y(z) = F_X(z^{1/2})$$

Using the Chain Rule (Section 12.4) we obtain, for z > 0,

$$f_Y(z) = F_Y'(z) = \frac{d}{dz} F_X(z^{1/2}) = f_X(z^{1/2}) \cdot \tfrac{1}{2} z^{-1/2} = 2z^{1/2} e^{-z} \cdot \tfrac{1}{2} z^{-1/2} = e^{-z}$$

for the density of Y. Of course, $f_Y(z) = 0$ when $z \le 0$. ►


Example 8 Many behavioral systems which generate responses have an underlying periodic or cyclic component.* Typical examples are heartbeat, periodic variations in population size, and cycles of temperature, capillary level, metabolism, and ovarian activity.

One model proposed by McGill† postulates an underlying process which produces an excitation every $\tau$ units of time, where $\tau > 0$ is a constant. From time to time, these excitations elicit responses from the system. It is assumed that at most one response may occur between two successive excitations, and the time from the first excitation following a response to the next response is an exponential random variable W.

Figure 7 gives a picture of the process. Here $R_1$ and $R_2$ denote successive responses, T being the total waiting time between them. No response can

FIGURE 7 Time line showing the excitations $E_0, E_1, \ldots, E_k, E_{k+1}$ spaced $\tau$ units apart and the waiting time W

*For an interesting account of many such phenomena, see Journal of Cycle Research, published by the Foundation for the Study of Cycles, New York.

†McGill, W. J., "Random Fluctuation of Response Rate," Psychometrika 27, 3-17 (1962).





10. Let us assume that mortar rounds fall on a plane in accord with a Poisson distribution at an average rate of $\lambda$ per unit area.* Let D denote the distance from an individual hit H to its nearest neighbor. Prove that $\pi\lambda D^2$ is an exponential random variable with expected value 1. (Hint: The distance D exceeds r if and only if there is no other hit in the circle of radius r centered at the hit H.)

11. In Problem 10, find the density function of the distance to the nearest neighbor in a Poisson distribution of points in three-dimensional space.

12. In Example 3 of Section 21.1, suppose the two implicit responses occur at the respective rates of five and one per minute. What is the probability that the latency period exceeds 3 minutes?

13. The following indicates the lack of memory of the exponential distribution. Prove that if T is an exponential random variable then

$$P(T \le t + s \mid T > s) = P(T \le t)$$

Intuitively, if the waiting time T is exponential, then having waited s minutes, one has no assurances whatever about the remaining time. In fact, the remaining time has the same probability distribution as the original waiting time.

14. Fill in the details of derivation of Equation (9) by showing

$$P(k\tau < W \le k\tau + s) = e^{-\lambda k \tau}(1 - e^{-\lambda s})$$

15. (a) Show that the moment generating function of the random variable S in Example 8 is

$$m_S(\theta) = E(e^{\theta S}) = \frac{\lambda}{\lambda - \theta} \cdot \frac{1 - e^{-(\lambda - \theta)\tau}}{1 - e^{-\lambda \tau}}$$

(b) Use (a) to find the mean and variance of S.

16. In Example 8, show that the density function of the random variable V is

$$f_V(t) = \frac{\lambda e^{-\lambda(\tau - t)}}{1 - e^{-\lambda \tau}} \qquad 0 < t < \tau$$

(Hint: First find the distribution function of V by noting that all S random variables have the same distribution, as do all V random variables, and that the sum of one S and the next V is $\tau$.)

17. In Example 3 of Section 21.2 show that at equilibrium (in the limit)

(a) the expected number of customers in the system (including the one being processed) is $\lambda/(\mu - \lambda)$



Find the probability of the following events 

(a) A (b) A U A 2 (c) As n A* 

(d) Az H A\ (e) ( A\ U A 3 )' (f) {A 3 Cl Ai ) 0 (A\ U As) 

{Hint Um I 2 e -2 * = 0) 

4. Let X be a random variable with uniform density

$$f_X(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \text{for } \alpha < x < \beta \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the mean and standard deviation of X.

(b) Find the mean and standard deviation of Y = aX + b in terms of those for X (where a and b are constants).

(c) Find the density function of $Z = -\ln X$ when $\alpha = 0$ and $\beta = 1$.

5. Let X be a continuous random variable with distribution function $F_X$. Define the random variable Y by $Y = F_X(X)$. Show that Y is uniform on the interval [0, 1].

6 Let Z be a continuous random variable with density 

( fro* when 0 < x < 2 

0 otherwise 


(a) Determine fro 

(b) Find the expected value of 2 Z 2 + 1 

7. Let X be a beta variable (Example 3). Show that $E(X) = \alpha/(\alpha + \beta)$ and $\mathrm{Var}(X) = \alpha\beta/[(\alpha + \beta)^2(\alpha + \beta + 1)]$. (Hint: Calculate the moments directly using the relation $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$.)

8. Find the optimum inventory stock level in Example 5 when $c_1$ = $2, $c_2$ = $3, and demand is an exponential variable with expected value 3.

9. Suppose that on a certain political issue the position P preferred by individual voters is a random variable having the density function

$$f_P(x) = \frac{\lambda}{2}\, e^{-\lambda |x - \mu|} \qquad -\infty < x < \infty$$

where $\lambda > 0$ and $\mu$ are constants. Denote a candidate's position by $\theta$ and suppose that if the candidate's position does not coincide with that of an individual voter, he suffers a loss

$$L = (P - \theta)^2$$

If the candidate wishes to minimize his expected loss, what platform position should he adopt? (Hint: Write $(P - \theta)^2 = [P - E(P) + E(P) - \theta]^2$.)





FIGURE 8

The moment generating function of X is

$$m_X(s) = E(e^{sX}) = \int_{-\infty}^{\infty} e^{sx}\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx$$

By multiplying out the term $-(x - \mu)^2/2\sigma^2$, combining all terms in x, and then completing the square in the resulting quadratic, $e^{sx} e^{-(x-\mu)^2/2\sigma^2}$ may be rewritten as

$$e^{s\mu + \sigma^2 s^2/2}\, e^{-[x - (\mu + \sigma^2 s)]^2/2\sigma^2}$$

Hence

$$m_X(s) = e^{s\mu + \sigma^2 s^2/2} \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-[x - (\mu + \sigma^2 s)]^2/2\sigma^2}\, dx$$

The integral has the same form as (11) with constants $(\mu + \sigma^2 s)$ and $\sigma$ in place of $\mu$ and $\sigma$. It follows that its value is 1, and the moment generating function of X is

$$m_X(s) = e^{s\mu + \sigma^2 s^2/2} \qquad (12)$$

Differentiating $m_X$ gives

$$E(X) = m_X'(0) = \left(e^{s\mu + \sigma^2 s^2/2}\right)(\mu + \sigma^2 s) \Big|_{s=0} = \mu$$

$$E(X^2) = m_X''(0) = \left(e^{s\mu + \sigma^2 s^2/2}\right)(\sigma^2 + [\mu + \sigma^2 s]^2) \Big|_{s=0} = \sigma^2 + \mu^2$$

so that

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = (\sigma^2 + \mu^2) - \mu^2 = \sigma^2$$

Parameters $\mu$ and $\sigma$ in (10) thus represent the mean and standard deviation of the corresponding random variable.

Henceforth we shall indicate the fact that a random variable X is normal with expected value $\mu$ and variance $\sigma^2$ by writing "X is $N(\mu, \sigma^2)$."

The standardized random variable corresponding to a random variable X having mean $\mu$ and variance $\sigma^2$ is

$$Z = \frac{X - \mu}{\sigma}$$


(b) the expected number of customers waiting and not being processed is $\lambda^2/[\mu(\mu - \lambda)]$

(c) the expected service time is $1/\mu$

(d) the expected waiting time (prior to processing) for a customer is $\lambda/[\mu(\mu - \lambda)]$

(e) the expected time a customer spends in the system is $1/(\mu - \lambda)$

18. In Problem 11 in Section 21.2 find, in the limiting case,

(a) the expected number of elements in the system

(b) the mean length of the common waiting line, excluding the elements being processed

(c) the average waiting time of an input before processing

(d) the average time that an element spends in the system

19. Four employment counselors are available to serve the clientele of a private employment agency. Clients arrive in accord with a Poisson process at a mean rate of 24 per 8-hour day. The time each interviewer spends with a client is exponentially distributed with an average interview length of 30 minutes. Clients are seated in a reception room and interviewed on a first-come, first-served basis. The agency manager wishes to know at equilibrium

(a) how long, on the average, a client is kept in the system

(b) the average length of time a client waits before being interviewed

(c) the average number of clients in the reception room

(d) the probability that an interviewer is waiting for a client

(e) the expected number of idle interviewers


21.4 THE NORMAL DISTRIBUTION

A continuous random variable X which has a density function of the form

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \qquad -\infty < x < \infty \qquad (10)$$

is called a normal random variable. Here $\sigma > 0$ and $\mu$ are constants. The name "normal" arose in the nineteenth century when it was erroneously thought that most real world situations gave rise to these random variables.

A graph of the normal density function is shown in Figure 8. It is apparent from (10) that f is non-negative and we state without proof the fact that

$$\int_{-\infty}^{\infty} f(x)\, dx = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx = 1 \qquad (11)$$

regardless of the values chosen for $\mu$ and $\sigma$.* Hence f is a legitimate density function.

*Except, of course, that $\sigma$ must be positive. A proof of (11) may be found in Widder, D. V., Advanced Calculus (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1961), 2nd ed., p. 371.




FIGURE 9

the distribution function of any normal random variable can then be found from that of a standardized variable. For example, if X is N(2, 9),

$$P(X \le 5) = P\left(\frac{X - 2}{3} \le \frac{5 - 2}{3}\right) = P(Z \le 1) = 0.8413$$

$$P(X \le 3) = P\left(\frac{X - 2}{3} \le \frac{3 - 2}{3}\right) = P(Z \le \tfrac{1}{3})$$

which, interpolating in the table, is approximately 0.630, and

$$P(-5.5 < X \le 8) = P(X \le 8) - P(X \le -5.5)$$

$$= P(Z \le 2) - P(Z \le -2.5)$$

$$= P(Z \le 2) - P(Z > 2.5)$$

$$= 0.9772 - (1 - 0.9938)$$

$$= 0.9710$$

The probability $P(-k\sigma < X - \mu \le k\sigma)$ that a normal random variable takes a value within k standard deviations of its expected value is the same as the probability $P(-k < (X - \mu)/\sigma \le k) = P(-k < Z \le k)$ that Z falls within k units of zero. For k = 1, 2, and 3, these probabilities are, respectively, approximately 0.683, 0.954, and 0.997 (see Figure 10).
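Standardization can also be carried out numerically instead of by table lookup. The sketch below assumes X has mean 2 and standard deviation 3 (the values implied by the standardization above) and evaluates the standard normal distribution function through the error function in Python's `math` module; the names `phi` and `normal_cdf` are illustrative.

```python
import math

def phi(z):
    """Distribution function of a N(0, 1) random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X normal with mean mu and standard deviation sigma."""
    return phi((x - mu) / sigma)

mu, sigma = 2.0, 3.0  # assumed parameters of X for this sketch

p_le_5 = normal_cdf(5.0, mu, sigma)                                    # P(Z <= 1)
p_le_3 = normal_cdf(3.0, mu, sigma)                                    # P(Z <= 1/3)
p_between = normal_cdf(8.0, mu, sigma) - normal_cdf(-5.5, mu, sigma)   # P(-5.5 < X <= 8)

# Probability of falling within k standard deviations of the mean
within_k = [2.0 * phi(k) - 1.0 for k in (1, 2, 3)]

print(round(p_le_5, 4), round(p_le_3, 3), round(p_between, 4))
print([round(p, 3) for p in within_k])
```

Because `phi` depends only on the standardized value, the `within_k` probabilities are the same for every choice of mean and standard deviation.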


FIGURE 10




If X is $N(\mu, \sigma^2)$, the moment generating function of Z is

$$m_Z(s) = E(e^{sZ}) = E(e^{s(X - \mu)/\sigma}) = e^{-s\mu/\sigma} E(e^{(s/\sigma)X})$$

Since the last term is the generating function (12) of X with s replaced by $s/\sigma$, we have

$$m_Z(s) = e^{-s\mu/\sigma}\, e^{s\mu/\sigma + s^2/2} = e^{s^2/2}$$

Comparing this with (12) we see that Z is a normal random variable with expected value 0 and variance 1. That is, if X is $N(\mu, \sigma^2)$, then $Z = (X - \mu)/\sigma$ is N(0, 1).

Table 2 shows some values of the distribution function of Z.* The fact that the density function of Z is symmetric about zero means that

$$P(Z \le -z) = P(Z > z)$$

for all real z (see Figure 9). For this reason negative values are not tabulated.


TABLE 2 Distribution function of a N(0, 1) random variable

  z     P(Z ≤ z)      z     P(Z ≤ z)      z     P(Z ≤ z)
 0.0     0.5000      1.0     0.8413      2.0     0.9772
 0.1     0.5398      1.1     0.8643      2.1     0.9821
 0.2     0.5793      1.2     0.8849      2.2     0.9861
 0.3     0.6179      1.3     0.9032      2.3     0.9893
 0.4     0.6554      1.4     0.9192      2.4     0.9918
 0.5     0.6915      1.5     0.9332      2.5     0.9938
 0.6     0.7257      1.6     0.9452      2.6     0.9953
 0.7     0.7580      1.7     0.9554      2.7     0.9965
 0.8     0.7881      1.8     0.9641      2.8     0.9974
 0.9     0.8159      1.9     0.9713      2.9     0.9981
                                         3.0     0.9987


Moreover, since

$$P(Z \le z) = P\left(\frac{X - \mu}{\sigma} \le z\right) = P(X \le \mu + \sigma z)$$

*More extensive tables are available. See, for example, Tables of the Error Function and its Derivative (National Bureau of Standards Applied Mathematics Series 41, Washington, D.C., 1954) or almost any text on statistics.













We see from Table 2 that the system is in control whenever

$$\frac{LCL - \mu}{\sigma} \le -2.33$$

or, equivalently, when

$$\mu \ge LCL + 2.33\sigma$$

When the process is found to be out of control, the machine is stopped and adjusted to bring it back within control limits.

Since $\mu$ varies and is unknown, the company is forced to make inferences about the process and to take appropriate action on the basis of a sample of n boxes chosen from the current output. Naturally, because of random fluctuations, the sample will only approximate the actual value of $\mu$. Thus there are two errors the company may make:

(I) Decide the process is out of control when it is actually in control

(II) Decide the process is in control when it is out of control

The discussion throughout this example follows the general procedure used in statistical testing of hypotheses, in this case the hypothesis that the process is in control.

Let $X_1, X_2, \ldots, X_n$ be the weights of the cereal in the boxes chosen and let

$$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$

be the average of these weights. For a proper decision policy there should be a number c such that the process is assumed to be in control when $\bar{X} \ge c$ and out of control when $\bar{X} < c$. Because of possible sampling errors, there is a positive probability $\alpha$ of making an error of type I. We shall arbitrarily set $\alpha = 0.10$ and choose c so that

$$P(\bar{X} < c \mid \mu = LCL + 2.33\sigma) = \alpha$$

To take a specific example of the effect of the company's policy, suppose LCL = 16 ounces, $\sigma$ = 0.1 ounce, and take n = 16 boxes in a sample. Then $\bar{X}$ is $N(\mu, 0.01/16)$ and

$$P(\bar{X} < c \mid \mu = 16 + 2.33(0.1)) = 0.10$$

so that c must be 16.201.
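The cutoff c can be recomputed from the stated inputs (LCL = 16, standard deviation 0.1, sample size 16, and a type I error probability of 0.10 at the boundary mean). The sketch below uses `statistics.NormalDist` from the Python standard library; the helper `prob_declared_out_of_control` is a name introduced here for illustration.

```python
from statistics import NormalDist

LCL = 16.0      # lower critical level, in ounces
SIGMA = 0.1     # standard deviation of a single box weight
N = 16          # boxes per sample
ALPHA = 0.10    # chosen probability of a type I error at the boundary

# Smallest mean for which the process is still "in control"
mu_boundary = LCL + 2.33 * SIGMA
sd_of_mean = SIGMA / N ** 0.5

# c is the ALPHA-quantile of the sample mean when mu sits at the boundary
c = NormalDist(mu=mu_boundary, sigma=sd_of_mean).inv_cdf(ALPHA)

def prob_declared_out_of_control(mu):
    """P(sample mean < c) when the true process mean is mu."""
    return NormalDist(mu=mu, sigma=sd_of_mean).cdf(c)

print(round(mu_boundary, 3), round(c, 3))
print(round(prob_declared_out_of_control(mu_boundary), 3))
```

Evaluating `prob_declared_out_of_control` over a range of means traces out the probability of declaring the process out of control as the true mean varies.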

These values of c, n, $\sigma$, and LCL give, corresponding to each value of $\mu$, probabilities of making errors of either type as shown in the graphs of Figure 11. (Note carefully the differences in scale on the two graphs.)

Figure 11(a) shows that the probability of a type I error decreases as $\mu$ increases, so that $\alpha = 0.10$ is, in effect, the maximum probability of such an error. This maximum probability is usually called the significance level of

If $X_1, X_2, \ldots, X_n$ are independent normal random variables with respective means $\mu_1, \mu_2, \ldots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, the moment generating function of the sum

$$Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$$

is

$$m_Y(s) = E(e^{sY}) = E(e^{s(a_1 X_1 + \cdots + a_n X_n)}) = E(e^{a_1 s X_1} e^{a_2 s X_2} \cdots e^{a_n s X_n})$$

Since the $X_i$ are independent, this may be written as the product of the individual generating functions

$$m_Y(s) = \prod_{i=1}^{n} E(e^{a_i s X_i}) = \prod_{i=1}^{n} \exp\left(a_i s \mu_i + \tfrac{1}{2} a_i^2 s^2 \sigma_i^2\right) = \exp\left(s \sum_{i=1}^{n} a_i \mu_i + \frac{s^2}{2} \sum_{i=1}^{n} a_i^2 \sigma_i^2\right) \qquad (13)$$

(The notation exp(x) means the same thing as $e^x$.) Thus Y is normal with mean $\sum_{i=1}^{n} a_i \mu_i$ and variance $\sum_{i=1}^{n} a_i^2 \sigma_i^2$. In words, any linear combination of independent normal random variables is another normal random variable.

A particularly important case arises if

$$Y = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$

is the average of n independent normal random variables, each having the same mean $\mu$ and variance $\sigma^2$. In statistical terms, Y is the mean of a random sample of n observations taken from a normal population $N(\mu, \sigma^2)$. Putting $a_i = 1/n$ in (13), we see that Y itself is normal with mean $\mu$ and variance $\sigma^2/n$. That is, Y has the same expected value as each of the $X_i$, but a smaller variance. The fact that $\mathrm{Var}(Y) = \sigma^2/n$ goes to zero as $n \to \infty$ provides a basis for the statistical technique of sampling to obtain estimates of population parameters, in this case the mean $\mu$.

Example 1 Let us assume that the amount A of breakfast cereal a filling machine puts into a box is a normal random variable with mean $\mu$ and variance $\sigma^2$. From time to time the machine may get out of adjustment, in which case the mean amount changes although the variance remains fairly constant. The net weight printed on the box constitutes a lower critical level (LCL) of output such that the box is accepted if $A \ge LCL$ and rejected if A < LCL.

The process is considered to be "in control" if $P(A < LCL) \le 0.01$ and out of control otherwise. Since

$$P(A < LCL) = P\left(\frac{A - \mu}{\sigma} < \frac{LCL - \mu}{\sigma}\right) = P\left(Z < \frac{LCL - \mu}{\sigma}\right)$$





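Theorem 2 The Central Limit Theorem. Let $X_1, X_2, X_3, \ldots$ be a sequence of independent random variables all having the same distribution function and common mean $\mu$ and variance $\sigma^2$. Define a new sequence of standardized random variables by

$$Y_1 = \frac{X_1 - \mu}{\sigma}$$

$$Y_2 = \frac{X_1 + X_2 - 2\mu}{\sqrt{2}\,\sigma}$$

and, in general,

$$Y_n = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sqrt{n}\,\sigma}$$

Let $F_{Y_1}, F_{Y_2}, F_{Y_3}, \ldots$ be the sequence of corresponding distribution functions and let $\Phi$ denote the distribution function of the standardized normal variable Z. Then, for each real number x,

$$\lim_{n \to \infty} F_{Y_n}(x) = \Phi(x)$$

In words, the sequence of probabilities $P(Y_n \le x)$ that the random variables $Y_1, Y_2, \ldots$ are less than or equal to x, converges to the number

$$\Phi(x) = P(Z \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$$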
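The convergence asserted by the theorem can be observed directly in a case where $P(Y_n \le x)$ is computable exactly. The sketch below takes each $X_i$ to be 0 or 1 with probability 1/2 (an illustrative choice made here), so that the sum is binomial, and compares the exact value of $P(Y_n \le 1)$ with $\Phi(1)$ for several n.

```python
import math

def phi(x):
    """Distribution function of the standardized normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binomial_cdf_half(n, k):
    """Exact P(S_n <= k) for n fair 0/1 trials, using integer arithmetic."""
    term, total = 1, 0                     # term holds the binomial coefficient C(n, j)
    for j in range(k + 1):
        total += term
        term = term * (n - j) // (j + 1)   # C(n, j+1) from C(n, j)
    return total / 2 ** n

def standardized_cdf(n, x):
    """P(Y_n <= x), where Y_n = (S_n - n/2) / (sqrt(n) / 2)."""
    k = math.floor(n / 2 + x * math.sqrt(n) / 2)
    return binomial_cdf_half(n, k)

target = phi(1.0)
for n in (100, 400, 10000):
    exact = standardized_cdf(n, 1.0)
    print(n, round(exact, 4), round(abs(exact - target), 4))
```

The gap between the exact probability and $\Phi(1)$ shrinks as n grows; for a discrete sum like this one it is mostly due to the jumps of the binomial distribution function.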


Example 2 A community concert series offers the same program on two successive evenings. Suppose n patrons independently and at random decide which performance they will attend. Then $S_n$, the number in attendance at the first performance, is the outcome of n Bernoulli trials with probability 1/2 for "success" (that is, choosing the first evening). We can write $S_n = X_1 + X_2 + \cdots + X_n$ where, for $1 \le i \le n$, $X_i$ denotes the number of successes (0 or 1) on the ith trial. The $X_i$ are independent random variables each having the same probability function

$$P(X_i = 0) = P(X_i = 1) = \tfrac{1}{2}$$

It follows that $E(S_n) = n/2$, $\mathrm{Var}(S_n) = n/4$, and the Central Limit Theorem states that $S_n$ has a distribution function which is approximately the same as that of a N(n/2, n/4) random variable.

Assuming the auditorium can seat a < n patrons, the probability $\beta(n)$ that on one of the two nights more patrons attend than can be seated is given approximately by the integral

$$\beta(n) \approx 1 - \int_{n-a}^{a} \left(\frac{2}{\pi n}\right)^{1/2} e^{-2(x - n/2)^2/n}\, dx \qquad (14)$$

The series directors wish to determine the largest number n of tickets which can be sold and still keep $\beta(n)$ below some arbitrary level $\alpha$. Making the change of variable $z = (4/n)^{1/2}(x - n/2)$ and writing

$$z_\alpha = (4/n)^{1/2}(a - n/2) \qquad (15)$$



FIGURE 11 (a) Probability of an error of type I for various values of $\mu$. (b) Power curve showing the probability of not making an error of type II for various values of $\mu$

the statistical test procedure. Of course, for $\mu$ < 16.233, a type I error is impossible.

For a given value of $\mu$, the probability of not making a type II error is called the power of the test. Figure 11(b) shows the power curve of the test procedure. In this case the power decreases as $\mu$ increases since as $\mu \to 16.233$, it becomes increasingly more difficult to make the correct decision. For $\mu \ge 16.233$ no error of type II is possible.

Note that at 16.233, the power is equal to the significance level $\alpha$. This illustrates a phenomenon typical of statistical tests, that in order to improve (lower) the significance level one must at the same time reduce the power of the test. Usually, the only way to improve both quantities is to increase the size of the sample. ►

We have seen that a sum of independent normal random variables is itself a normal random variable. This is not a particularly surprising result. However, there is a very surprising result, called the Central Limit Theorem, which states that if $X_1, X_2, \ldots, X_n$ are any independent random variables having a common distribution function, then for large values of n the sums

$$T = X_1 + X_2 + \cdots + X_n$$

and the averages

$$Y = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$

have distribution functions which can be closely approximated by distribution functions of normal random variables. This provides perhaps the most important use of the normal distribution in applications. In practice, n is usually regarded as large if it exceeds 30.

A proof of the Central Limit Theorem is rather lengthy, so we will content ourselves with a precise statement of the theorem and an example of its use.





9. If X and Y are independent normal random variables with respective means $\mu$ and $\nu$ and variances $\sigma^2$ and $\tau^2$, show that Z = X - Y is a normal random variable with mean $\mu - \nu$ and variance $\sigma^2 + \tau^2$.

10. A coin tossed 400 times yields 280 heads. Would you say the coin is fair? (Use the normal approximation.)

11. In testing whether the coin of Problem 10 is fair, you could make one of two errors:

Type I: Say the coin is not fair when it actually is.
Type II: Say the coin is fair when it is actually biased.

Draw the power curve for this testing procedure. (See Example 1.)

12. A company employing 1000 workers maintains a cafeteria and a separate snack bar. If each worker chooses independently and has probability … of eating in the cafeteria and … of eating in the snack bar, how many seats should be provided in each facility if the company wishes seating to be adequate in both areas on 99% of the working days?


SUPPLEMENTARY READING

Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley & Sons, Inc., New York, 1968), Vol. I, 3rd ed., Chapter XVII.

Parzen, E., Modern Probability Theory and Its Applications (John Wiley & Sons, Inc., New York, 1960), Chapter 6.





in (14) gives 


m « i 



1 c -(l/2)z 2 

\Zlir 


dz 


If, for example, $a = 6000$ and we want $\beta(n) < \alpha = 0.01$, then $z_\alpha = 2.58$.
Solving for $n$ in (15), we find n = 11,718 as the maximum number of tickets
which can be sold to ensure that the seating capacity of the auditorium will
be sufficient 99% of the time.
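
The calculation can be sketched numerically. Equation (15) is not reproduced here, so the Python sketch below rests on an assumption: each of the n ticket holders independently attends one of the two nights with probability 1/2, so first-night attendance is approximately normal with mean n/2 and standard deviation sqrt(n)/2, and seating suffices when n/2 + z*sqrt(n)/2 does not exceed the capacity. The function names are hypothetical. Under this assumption the result lands within a few tickets of the figure quoted above; small differences can arise from rounding or continuity conventions.

```python
import math


def max_tickets(capacity: int, z: float) -> int:
    """Largest n with n/2 + z*sqrt(n)/2 <= capacity (assumed model, see lead-in)."""
    # With s = sqrt(n) the boundary case is s**2 + z*s - 2*capacity = 0.
    s = (-z + math.sqrt(z * z + 8.0 * capacity)) / 2.0
    return math.floor(s * s)


def prob_sufficient(n: int, capacity: int) -> float:
    """Normal-approximation probability that neither night exceeds capacity."""
    if n > 2 * capacity:
        return 0.0  # the two nights together cannot seat everyone
    half_width = (capacity - n / 2.0) / (math.sqrt(n) / 2.0)
    return math.erf(half_width / math.sqrt(2.0))  # P(|Z| <= half_width)


n = max_tickets(6000, 2.58)
print(n, prob_sufficient(n, 6000))
```

The same `prob_sufficient` helper can be evaluated at other ticket counts, such as those in Problems 5 and 6, with the caveat that it encodes the assumed model rather than the text's equation.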


PROBLEMS 

1. If X is N(1, 10), use Table 2 to find P(X > 1.1) and P(−4 < X < 5).

2. Evaluate 



3. Suppose X is N(0, 1) and let $f_X$ denote the density function of X. Show
that

$$E(X \mid a < X < b) = \frac{f_X(a) - f_X(b)}{P(a < X < b)}.$$


4. Determine k so that


/(*) = ke-'-" -»<*<«, 


is a density function. If X is any random variable having this density,
find E(X) and Var(X).


5. (a) In Example 2, suppose 11,900 tickets are sold. What is the probability
that attendance exceeds seating capacity on at least one of the
two nights?

(b) What is the probability that capacity is exceeded on the first night?

6. Same as Problem 5 when 12,100 tickets are sold.


7. Show that if X is N(0, $\sigma^2$), then $Y = X^2$ is a gamma random variable
with density

$$g(y) = \frac{e^{-y/2\sigma^2}\, y^{-1/2}}{\sqrt{2}\,\sigma\sqrt{\pi}}, \qquad 0 < y < \infty.$$

(Hint: First write the distribution function of Y and then differentiate.)






ANSWERS TO SELECTED EXERCISES


Section 

1.1 


Section 

1.2 


Section 

1.3 


Section 
1 A 


Chapter 1 

^ Cnain Aleeria, Mali* Upper 
1. (a) (Russia, Antarctica, U S, E "®' a ^, d ’ "^Delaware, Vermont} 2 W 
Volta, Ghana) (0 {Alaska, Nevada Wyommg and * > 0} or 

{x * IS a citizen of Canada) « } 3 . (b) 4 W 

(2, 4, 6, 8, ) or (x * = 2« where n voted in thre e president. a 

m is a 26 year old U S male who has 1 ga y _ J} 5 . (a) Set of 

elections) = 4> (0 {* * 2 + 3 * 3 (c) «2, 3 )) The intersection of 

points on the line with equation y - 1 ^ (b) 6 (a) 16 12- 

the lines in (a) and (b) ( ) * J element 

„e + ¥( „ - 1)0, - 2){« - 3) for the ^ ^ ^ equa , (a) and (c) 

1. (a) 16 (b) 2", in 00 16 - 2 (a)i (c), (d), anc 1(0 are correc 

t?*--. .5 

15 {A. B, C, D, AB, AC, AD, BC , ^ Rg (c) * U J 


«■" - — “ "" 

(c) 685 

1. Partial answer For the first distnbuUve ^ ^ 

du(snc) = V { * c Vg) n{’aA E ./.k> “,t’nnu(Fnfl)S'' 

UUS)n(4UC) = £ ■” f n L, o c / M < W ( 7 (a) /«'. 

3.(.)Mncv* (c)F r?^ns (O'" (d)/in ^ 

(OMnfl’nen/? 5 * s ' ( d d , st ribulive law twice gives 

00 1/ Cd) « 9- Usmg nOU («nO 

WfUBuenuonO- u IM u « n Cl = 3 1 + 6 

. 3 , . 2 while Mt/I(x)l = M(* + 2) 

17. /t[M(x)l = A( 3x) = 3x + / e 

19. * is both commutative and 


Chapter 2 subscis of A 

Section 1. Truth set of B C On «) g = * ' " « T (b) T ' 

2.1 2.0,4,6,10) 5. (a) P (c)r ' 

9. (a), (b), (c), (e), and (f)



Section 

4.2 


1. (a) |2(— 1)1 = l~ 2 l 
(c) 121 


J- = i — 2! = 2 • 


2 = 2-i = i2|-i-i! wjrij -1 * ' 1 
2 = ,-2|. (d) 12-t-DI = HI = >*- Me ,2 ‘- 1 - 11 

, -.y __ ,31 — 3, while { 2 ' — I - *! 

(e) Sal + '*! = !“ — 6 ! 


2 - 1 = 1 „ 
7. la - b- 
b- Thus, 


Section 

4.3 


121 

™u 

i u ll n\ — IMl is equivalent to 2ao — Q> / x; —8 < v < 5>. 
L, by Properties 3 and 10 10- {*■ el of points - whose d,stance 

14. Partial answer: In 10, {r: l* - 3 1 < * 

from the point 3 is less than 6 uni 3. If m and t 

££ “ > ?* — « * < * 7 - Fol, °" " 


y U ". _999- 
(b; JOOO- 


Section 

5.1 


Section 

5.2 


Section 

5.3 


Section 

5.4 


ChaptEr5 5 Partial answer. Yes. 

! S115 3. Neither, since « < «• “"^Bounded Lower bound 0 Upper 

6. (alii,*. I «, 2 ' fjjund -tapper bound i.andso fort « 

bound 1. Alternatively, lower ^ = ■ < d > “• " 

Bounded. Lower bound 0 Upperbound2. The sequence is 

4n - ^ 

bounded below by Si - S100 000, D ^ M 1S an upper bou 

for all n, then — M < P* — 

lower bound for the sequence _ L Then 2" >1 + " “ 

, 1(1 13 . In Problem 12, Id “ 18 . To obtain the 

4. (b) Approximately 71). * . . > n for all n _ 2 , for 

true for aU n, from winch it ^ 10, choose « - 

result of Problem 8, I et a a 
Formula (4), let a = P and r 


. 4. The sequence is 

, nfl ft 2 3 4 )• M 0. -*• ’• "*• 8 The ’limit >* 4- Problem 7 apph« 

notbounded, hen« ,t does not com« E e. ^ ? 15. Let d = 0, 

ID. (a) 3. (b) No limit. W } ) 

1, — 1, . . .) and B = ( L *’ * a G. On = 

1 c ^ i — rand r < L hm (Ll ^ i-r’ 

3 ' i) t/(l - d (c) Let c = 1 “ 

aic .-i + L (Lp£=-) - U - indUC,IOn ' ^ 7 . This follows from the 

V 1 7 A = 1 whenO <« < 2 ,,.0 

and L = a Then hm W> ? parallel rhe *?«*»*? % for 

contraposime of Property 13 . ifa Use Example 

12.0 Use Problem 13, Sec othcIW «e. (P.) 

0 < a + b < 2, hm (P-l - ’ 

3 of Section 5 4 with a = 1 ~ 





Section 2.2

1. The truth set of $\sim p$ is $P'$. Hence, by Definition 2(c), the truth set of $\sim p \Rightarrow q$ is
$(P')' \cup Q = P \cup Q$. 2. No. 4. (a), (c), and (f). 5. (a), (b), (d), (e), (f),
and (h). 7. Try p and q true, and r false. The statements are not equivalent.

Section 2.3

5. Valid. 7. Valid. 9. Not valid. 11. Valid. 13. Not valid. 17. (a)
(i) and (m)


Chapter 3 

Section 3.1

1. (a), (c), (d), and (f) 3. Let A = {a} and B = {b}. (a) $\{(\phi, \phi), (\phi, A), (\phi, B), (\phi, P), (A, A), (A, P), (B, B), (B, P), (P, P)\}$ 4. {(1, 2), (2, 3)}

8. (a) {(1, 2), (1, 4), (2, 4)} (c) {(2, 2)} 9. 16

Section 3.2

1. Let R = reflexive, S = symmetric, and T = transitive. (a) R, S, and T
(c) S only (e) S only 3. One possible answer: The sets {a, c}, {b}, and
{d, e} partition U. 4. Partial answer: One equivalence class is {Jefferson,
Monroe, Buchanan, Grant}, the set of presidents born in April. 6. (c) and (e)
are equivalence relations, (a) is neither R nor S nor T, (b) is T only, (d) is S only,
(f) is T only, (g) is R and T, (h) is R and T, (i) is R. 14. Problem 13 applies
since I/C{/ and A Q B 

Section 3.3

3. Let R, I, S, A, T denote reflexive, irreflexive, etc. (a) I, S (c) I, T (d) I,
S, T (e) I, S, A, T (The relation is $\phi$, the empty relation.) (g) I, S, T (i) R,
T (k) R, S, T 5. (a) “is poorer than” (c) “is less popular than”

6. (a) $p \Rightarrow p$ for every statement p. Hence I is reflexive. If $p \Rightarrow q$ and $q \Rightarrow r$, then
$p \Rightarrow r$. Hence I is transitive. (c) $p \Leftrightarrow q$ means that $p \Rightarrow q$ and $q \Rightarrow p$. That is,
$pEq \Leftrightarrow pIq \wedge qIp$. 8. (a) The relation “has at least as good a racial balance
as” (c) The relation “is at least as risky as” 9. Every equivalence relation is
reflexive and transitive. 11. P is asymmetric, so we may have either xPy or yPx,
but not both. If neither xPy nor yPx, then xIy holds, by definition of I.

Section 3.4

2. (a) Not consistent. If x lives next to y and y lives in the same precinct as z, it
does not necessarily follow that x lives next to z. (c) Consistent. (d) Not
consistent. 7. Not connected. Subject may not prefer either x or y.

Section

3. (a) $(12, 15) \to (16, 10) \to (20, 5) \to (14, 5) \to (16, 3) \to (14, 3) \to (12, 3) \to (12, 1)$
4. Every pair of elements is now a comparable pair.

14. (a) 6 (b) 0 (c) 9


Chapter 4

Section 4.1

1. Domain = {Robert, Sam, James, Teddy}, Range = {James, Teddy, Sam}.
Domain = Range = set of all real numbers. 2. 6, 9 3. C(A) = D,
C(B) = {a}, $C(A \cup B) = C(U) = \phi$, $C(A \cap B) = \{a, b, d\}$. Domain of C is the
set of all subsets of U. 5. (a) and (d) 8. Assuming the relations consist of
all pairs (x, y) which satisfy the given equations, only (a) is a function. 10. (a)




a logio24 - logynS = j 427g 
logio3 





Section 

7.3 


X. (i + r) 1 = 1 + 4( + ®' 2 + 4,1 
_ 4s)3 = j c - 12s 5 + 48s 1 - 64s 

(S ' -- , 120„V 


2. $2650 


(0 Wx 6 / 4 ^ 10 ^ 24 

4 (al I ) flV "= *4uov- w V 6/ 

5 ‘ 1 ’ W s ,024, 5 + 1792/ - 1792/ + «» " 

(e) 2** V <B> 0 7847 '. ® ^ Put a - 4 - l - « " Et > uatl0n (5) f 

12 + Vo U should "encounter the binomial expansion ol 


('?)■ 


(1 - I)"' 




= 4200 


Chapter 8 


Section 

8.1 


1. 2 X 3, 2 X 1, 1 X 3, 3 X 5 


O V 

J AA _ ~ A 4. (a) (4 7) 

' ' ^ //■» c\ . I n 


, _ _ /2 6\ = A + (B + a 

S.A + B=(\- i ^B + AAA+B) + C={ 7 _ 7 j 

,V 27 . / ) 5\ ^ + B’, 44 = L 12/ 

- 4fl = (—13 13) ’ iA + B) ' “ ^ a ; r | ces 2 lse induction 15- A - B = 

(44' XX. Already proved for wo £ + = £ ta. + 

X + (_B) Use Problem 11 17 ‘ X 1 » -1 

W= £n, .+ £*■• = tr ' 4 + ,rB 


Section 

82 


Section 

8.3 


Section 

8.4 


M - £<.„+£. „ 3.J5870 

11. (a) / (e) (_1 o) (l) (- 1 °' ___ V- c 0 

18. Partial answer du = Vl8 and di4 17 -12\ 

,4=ni) 4-ui 

B = diag (di, - *) and £ ^induction 8 ' L " ” x 

,. W + ic *«(* a + » 

Then £ d, = laD's 

/ -1 , 0 \ (el A °A 

(d) (til - ®M (4W V* ' , 5 , he nu mber of persons who choose 

3. Partial answer The ith diagona 

person, 5. col (5 2, 0 4, 0 2) 


6. (a) If 
' 4 , c„). then CD ” flC “ 

8. Let /3 = CL L ) 


Chapter 9 


wif>-M hcnJC ” 13_ n 

■> rbl Inconsistent t c) 1 ' * 5. ra “ 

and“z = 26 -’9, b may t "iTumtar ^ " f ^ ’ll, 2 "> 

^ Any Wh,Ch “ ° P0S, “” 

12. The interior equilibrium 1 ^ mlcno r equilibrium 

multiple or the vector (7, 25, ) cn rxcncISES / 635 




Section 

6.1 


Section 

62 


Section 

63 


Section 

64 


Section 
6 5 


Section 

7.1 


Section 

7.2 


Chapter 6 

i. (a) 5 (c) 10 3. (a) Z 4 (c) £ (a + kd) (e) £ (*, - x) 2 

<t=o 2* *=o »=i 

4. (a) $1 + 2 + 2^2 + 2^3 + 2^4 + 2^5 = 63$ (c) $k(-2) + k(-1) + k(0) + k(1) = -2k$ 6. All sums equal 68. 7. $\sum_{k=1}^{12} 2k = 2(78) = 156$ 9. 3124,622 73

11. Apply Problem 11, Section 5 2 with a = 1, r = n, and n - L 13. £ (x, — x ) 
= £ x, — £x = nv — nX = 0 (Use Rules 1 and 4 for sums ) 15. (a) $5301 95 

(b)’ S2246 27 = ' (c) 

(1 04)" - 1 

1. (a) Approximately 1 05 3. i + f+ T 5+iT 5. 0 + |+^ + | 

9.0 + i-i + f 13. l.I.M.fil 15. (a)* 

(b) 1/(3 4“ l ) (c) 2 (d) Series diverges (Why'’) 18. $400 per year, 

37. (1 - c) E c'y, < (1 - c) E c'Y = 


- = Y 


forever, an infinite amount 
(l - c)Y E c = a - c)r- 

«■— 0 1 — c 

1.0 202020 2. 2202 1111 ,244 222 3. (b) 

-11010 0100 = -26 25, the difference of 86| and 113£ (c) 11110001101 II 

“ 1933 75 » the Product of 110 5 and 17 5 (e) 11 000 ~ 3 0, the quotient of 

and ? 4. 0 0100110011001 + 0 011001100110 = 10110011001100 

= Tff 5 * ( a ) 101 111100000001 (Binary) and 57401 (Octal) 

(d) 1000011100001 000100011 (Binary) and 10341 043 (Octal) 7. (a) 421323 
(c) 53333 (g) 13 770 8 (b) 71 6 a 7 (d) 6666 10 65535 

J; D /r. E r 3 Convergent S - Convergent 7. Convergent 9. Convergent 

J* W “ c + cM ) 14 - Converges for -1 < x < 1 (Ratio test fails 

1B C aw ^ 16 ' ^ all ° test fads . but comparison test shows convergence 

Wnrm e mS ^ 1“* Shows conver 8 e nce Second series is a divergent hyper- 
21 Cnnv SCneS w 19 1 ^ lverges Se Q ue nce of terms does not converge to zero 
21 Converges absolutely Ratio test applies 23 fbl ti(\ - n/fl _ n\) 

24 (a) e, = (-1)*-' W , (c) d/ lL d/162>m ^ (b) " I)/(1 " ™ 

4 K {'hih + X 'i + *‘ 3 + + *23 + *24 3- f 

4. II |*l < 1, both sums have the value 1/(1 — *)(1 _ *2) 

Chapter 7 

2. 6, 40,320, 39,916,800 4. 20, 495, 72 6. 480 7 11 9 44 

10. (a) 0 (d) 109 II. (a) 2 (c) 16 13. 2'° = 1024 14 (b) After 

roar elections, Democrats 25,488,000 and Republicans, 34,512,000 15 . J 

1. (b) 64, 24, 20, 4 3. 1000, 720 6 120, 10 7. 1, 1 9, They are 

equal ,4. 120 15. (b, 60 ,6 (a, (»\ _ I5504 , f samplln8 „ wllhout 

replacement (b) About 207% 19. (a) 1260 (b) 156 20. A i, D, C, 

and D £each 23. (a) 4, 7, 11 (b) Q + i 





15. D = diag (0, 0, 9), Q - 


D = diag (0, 2), Q 

/ 1/V2 1/VTS 3 

-1/V2 1/V^8 s 

\ o 4/vra 


('.-!) 


/_5/\/66 1/V2 2/V33' 

17. (a) D = diag (2, 0 32, 0 68), C = 4/V66 

/— 0 87 0 40 0 29\ 

F = ( 070 0 072 

\ 0 87 0 40 0 29/ 21. (PT)' = W ® 

18. W,spos, live deOnite, the rest are neith ons as fte orlgina l matrix 

22. Yes, the transpose must have the sam 


Section 

11.1 


Section 

11.2 


Section 

11.3 


ChaPler11 x ,U2=/(4) = 4 a + 6 solving 

4. If /to = ax + b, then 8 - /(2) J"J’ 2x + 4 6 (a) * 1 an 

for a and b gives a = 2, 6 - .an ^ rep i ace „ by „a in fOD 

x = -2 7. (a) /(« 2 > = ' ,, , / c \ „(„) = (1 - r)/r 

for all n .he limit is not 2 2 5 

1. Only (c), (a) and (b) contain the.rhm,t2 w > ^ 0n]y (d) and(e)are correct 

3-2 5.-1 7 CW^CW- hm^TrT - 2 , 

12. Marginal cost at x is lim _ x * entire set of real 

is the set of all numbers except | Hmts For Example 2, use 

. . M 0 (e) 0 5 - " , c2 > 5 For Example 3, use 

n 1 . « 3. (a) ■» , (c) " lhe r ac t that if S > J ' S > g 0 Hint Multiply 

the result of Example 1 and the tael 2 and * = e » 

Problems 13 and 16 in Section 5 2 with 

and divide by (** + « ' + * lim , t , 5 f. but the . 0 

r i •twrr- » ^ - - - 

5. use Example 4 in Section H* - J* ~ T/. Use Emblem 20. ^ 

Intermediate Value Theore 2 2 t hat section to If /( jc) = 2x» thcn 

to relate (a) and (b) Use of /at « .ft* ,f = * + U 

of course, is the deflmtion of continu. + > + 2 > + 1 

/(* + 1 ) = 2(* + >) = f + l y ol equal to /« + ' (,) 

/(x + 3) = 2 (x + l) + > 1 (jr) , loginv 

ISOOUtFW-* (b)L , ec TED EXERCISES / 697 




Section 2. EB is the matrix obtained from B by multiplying the first row by 3. 
9.2 4. / 1 0 0\ /I -2 0\ /I 0 -M 

( 1 1 0|,[o-l 0,0 1 yV . 7. No 9. Yes 10. No 

\— 2 0 1 / \0 6 1 / \0 0 YfJ 

Section 3. If A = 0, then AB = 0 for all B. Hence, there is no matrix B such that AB = I.
93 6. Inverse. (| f) Solution- xi = « = f 8. No inverse, no solution 

10. No inverse Solution: * = lOz - 6, y = — *^z + f, where z may be any real 
number 16. (a) The corresponding rows are interchanged in the inverse 

(c) In the inverse the corresponding row is divided by the same constant 

Section 7. Solution set empty 13. (b) (0,0) is the only interior equilibrium 

9.4 14. Only at (0,0) Otherwise, no 15. 021012 51 1. 

Section 9.5

2. $144, $216, and $0 3. (a) In Problem 1, corner points (0, 0), (0, 3), and
(9, 0) yield respective functional values 0 (minimum), 18 (maximum), and 18
(maximum). 4. The maximum value is $17,000, obtained at corner point (16, 1).
6. (a) The profit function P = 8x + 12y attains the maximum value $220 at
every point on the line segment between corner points (^, ^) and (^, 0). (b) The
inequality |x + 3> < 100 pertaining to Machine II. 8. R = 12,000 when
P = 2000, F = 2000, T = 0. 9. There are six corner points, in terms of a, c,
and h, respectively: (0, 0, 0), (0, 0, 15), (0, 20, 0), (30, 0, 0), 0, 15), and
(10, 20, 0). The maximum net return is $10,000 from 10 acres of cotton, 20 cows,
and no hens.


Section 

10.1 


Section 

10.2 


Chapter 10 


1 . 26 


5 . -27 


7. v 


_ ft + /o + Go C = a + bla + bG 0 


t - 6 \ - h 

9. For instance, cot (-1,14, 9) and col ft, -J£, -l) 12 . If row i is c times 

rowy, subtract c times row, from row r Apply Theorem 1 

3. Roots 0, 8; corresponding vectors (c, c) and (c, −c), where c is arbitrary.
5. Roots 15, −3; corresponding vectors (c, 2c) and (c, −c), where c is arbitrary.
7. Roots 1, 2, 3; corresponding vectors (7c, c, −c), (c, 0, 0), and (c, c, c), where c
is arbitrary. 9. Roots 3 and −3 (each a double root); corresponding vectors
J ’ , + c ’ *5 * and (~ 2c ~ d > ~ 2c > c, <0, where c and d are arbitrary. 10. 

(a) l and 2 (b) a and b 13. Roots |(a + d -f V(a^dV + 4 be) and 

i(a + d- \'(o - </)-■ + 46c) (a) (» _ if + 4 bc > 0 (b) (n - df + 

0 (c) (a ~ d)~ + 4 be < 0 

2232 4098 5964 9062 9296V 

0134 - I09S - 2330 - 1696 - 3928 j l6 ‘ (a) 2 ’ 2 ’ 3 ’ and 2 

(b) 0, 3 


4 be 
15 . F * 


Section 3. ~2u- + 6ut - 10t 2 
10.3 

2) xU 7. (x >) 


5. 2)0 2 + 4) 



133 + 6)1)1 + 2)2)3 + 4>234 + 


9- (x } 






,9. DAKM = - 0 .fa^ only 'fr'(x) 

(b) x = 5. MC( 5) = sidS) = -f 
Section 4. f'(x) = 3x 2 + 6v — 2 ,/"M = &x + ] * 


r(x) 


20. (a) x : 


S. Dg(u) = 5a 4 


- 6» 2 + i 


12.9 


D-g(a) 
9 v— 5/2 


, w"(x) = — 7* 


20 " - 12 " _,r„ WTr 2 /"(y) = (2 + 3^)/y 3 ' 2 (2 + VJ) 

9. /'O) = ; (z + * 13. rr'(x) - -(x 2 + 

„ .. ,.!_a, 4 .i.K- 4 , 6,0 (c) 

«"(v) = x(x 2 + n 2 )- 3 '- 


2 (2 + V’.n'L X' VS g'M --(*»'+ a 2 ) -1 ' 2 

11 . /'(*) - 2 e — (1 - ( ~ 2e J'_ J + ,, fa . * « « — 

sin x, e -1 - cos x, -c~* + sin x, e 1 + cos x 

D 2 RC( 1 ) = pa /~ 3 > 0 

0 <6’ 


” f ft T", 

17 - AU = b^»J 

19. Ifp>i. Choose What happens it 


Section 

13.1 


Chapter 13 

1. (a) 0, 0, 0, 0, 0 (b) 0, 0, 0, 0,0 5 ‘ (a > °’ -f, “A 

7. (b) no value. 4, *, *. V«0* + » ( « + 2) 


/(« + 4) - 4/(« + 3) + 
16 (a) 


Section 
13 2 


Section 
13 3 


-F* i 5 i 2 o.o 

section 13 1) 

P) 2 Mr), (« - ^f fiim - P0)Q(f + 1 ) 

i Gwec^+i)^ 

I K 21 r - 

1. (a) k 2 1 


2 (a — p)JV((), (o 
9 MC(n) = a - *-‘< n+1, (« + 1 


= A[fc 3 /3 - kV 2 + */«1 = A[ ^ (f: 1K2A ^ 

- . .. 1 \/*» t — - 1 1 = 1 


71 

L k* 


Section 

13.4 


Section 
13 5 


Section 

136 


(b) AT(n) = 397 + 4 + f" 2 - *“ e 3/3 + ,2 + 21/3 (W ' + 

(e) Minimum ,sT(0)= 10,000 5 ' <a) ' + 

(c) 2 2 1 + t . , ear a nd (c) and (e) are 

1. (a) and (b) are ^^^'‘"TfofaU n, *enK» + » ' {Jgji ^jc = ] = 

ZZZ r - »' 5 - (b) C1 = 2 ' 

3 n(n + l)/2 (e) h(n + ~ JAW (W „(,) = 5l 2 + 5r + 10 

C2 = 4 « (a)g(r) = 3,/2 + l (b) gO ^ sMam6 , „ 2- 

2. no *= (10 - 4/)2< 3 ‘ y(r) 1 3 2' diverges to ™ 4 y )n “ 05C , Hates 

7. solution no = a + i'> 2 oo 13 rw = -<■ V 

11. g(n) = 4& converges^ zero as 
and is unbounded as n ~~ 


w .. 

ded as n -*• 00 . lf . on then >p(" + 

, „ „e 1S the particular solutio ' (8ci _ 12c 2 )c 

1. If y,(n) = co + C‘" + « = (8co _ 6c i - ,2c2) + _ 3 3. 

6 yM + 2) + 3 y r (n + 1) + C2 = 1, c = 2 . and co ^ ^ 

Scan 2 Equating this to 8n +4 8 eqmhbrlum price P a „ a .y+b- 3" 
equilibrium, D =* S Thus w/t — d) S.Ji ) ^ y ~ 

a \ bp = c + d p to get P = (c - 9. *00 - ° + 

7. Y(n) = a 3- + «-')■ - "' /2 3n/ rlsES / 699 



> + I 



Section 

12.1 


Section 
12 2 


Section 

123 


Section 

124 


Section 

12.5 


Section 

12.6 


Section 

12.7 


Section 

128 


Chapter 12 

3. N'(t) = cN(t), for some constant c 5. /■'(*) = kx 2 , where k is a constant 
7. Let E{t) be the amount of energy expended by time t Then,£'(/) = k[p(t) — p(i)\, 
where k is a constant 9. 0 02, 1 XI. 0 704008, 35 2004 13. /'(*) = 

2 - 2t 15. Z)n(r) =3 + 8/ 17. ^ = 2/ + 1 18. (a) 4 (b) 6 

at 

21. (a) The slope of the tangent at x = 0 is /'(0) = — — — (b) 64 

(c) /(25) = 48 fc + “) 2 

6t 5 , IIa 10 , 3a 2 3. /'(a) = 8a, g\t) = -18r 5 + 12 / 2 - 2, ft'(u) = 8u 2 - 1 

5 a '(=) = 4 + z 3 7. !/'(/) = 3r 2 + 6/ + 2 9. Dw(z) = 4z 3 + 3z 2 - 4z 

11. z'(a) = 6/(ar + 2) 2 13 /'(at) = (a: 2 + 2a: - 10)/(a + l) 2 15. Du(r) 

- 4 - 2/- 3 + (2 - 2r)/(r + 2) 3 16. When a: = 2, dy/dx = 36 18. u’(a) = 

in 20. (a) f (b) 0 


1. At a = -§\/3« -1 155 and x = §\/3« 1 155 3. /'(a) = 3a 2 and 

HTo “ — — - 1 >'/(<•) - 1, then c = V3/3 = 0577 6. (a) In 

creasing for aU a (c) Increastngirz < -4orz > 1 , decreasing if -4 < z < 1 

7 ’r \ ^ ,7 + (b) c = 1 9 . Apply Rolle’s Theorem to the function 

sCv) =/(a) - /(it) 


1. 2(a 2 - 2a + 3)(2a - 
7. — 4/(/ 2 - 5)“ 3 9. 
(6 + 5r 2 )— 3/2 
-I>Q (/>)/0W - 1 


' 2 1 3. n 2 /(n 2 + a 2 ) 3 ' 2 5 _3m _ 2z) 1/2 

-^(7 - a)-i' 2 (12 + a)- 3 ' 2 11. (20/ 3 + 43/ - 18) X 

13 If Q(p) _ c/p, then Q'(p) _ _ c /,,2, so £(j>) = 


2. (a) 2 (/ l)sm(/— l) 2 (b) —3 a 2 sec 2 (—a 3 + 2) (c) — at, _ist/2y 
sm 2a sin (a - l) 3/2 + 2 cos 2a cos (a - !) 3/2 5. k<> - 2y 2 sec 2 y 2 + 

tanj- 7. g'(x) = 8a tan 2a 2 sec 2 2a 2 9 rriV= ,L„ l at 

is - *^«*.~x* ’ 

15. 2 x sec ArU tan at + 1) 17. >(a: + l)-i/2 sec 2 {x + 1)1/2 

I FoMns.nceao.A^andPA™ 3. No 6 S.nce/(, )./(-!). 1. 

D„(/) _ -2/iv'TTTrT ,, t V,. , , n - Slnce C32 ^‘ ' 2 = sin" 1 (I// 2 ), 

any function with donum S and range T Then 2 > 2h ! AZ: ?£ t/be 

2R2, but not/(2)Q7(2) e " 2 - 2 ' but not/(2) < /(2) That is, 

1. 1/(a + 2) 3 2- In 2 5. 2 a>[I + 1„ ,] 7 . _ 

"• c (sin M 4- COS JV-1 11 In,, A, . X '^ X J e 

I I (a) 16. About 2109 a d 84 - (log* e)ln $(at) Now use Theorem 

1./0/V2) a -\/2 is a minimum ,/(-l/v/ 2 ) = _ ,, 

-1 is a minimum, /,(!) = l 1S a maximum 5 « a maximum 3. h(- 1) - 
7. M-v) =. AA is a maximum n = 4 is a maximum 

,sa n,,mLnJ t ;^ r , STs-a mL7m U m ' "TZ ^ 

*• attains its maximum value, 

zero, at r - ± , ± — rh,™ 

2 2 ’ 2 ' Thcrc ls no ounimum 13. Maximum 


and minimum values occur at the points where tan M = £ 


14. 1 


16. i 





, T ., /r„2 4 - b") 11 You can replace cos- x m 

% ° - ”” d n “■ ” 

aconsl£mt + l) s. 2 fr + as- 
sertion 1. In (19), Choose g(v) = /(*> ® ■«*/<>) > , 2 + , 9. § In (2 + sin* 0 + C 

14 8 4(1 + 2) 1 ' 2 + C 7. 4(/ 2 + O' + C _ “ ‘ 13 is,n-i(3i./V2) + C 

11.X Ian- (X/S) + C Substitute ton • - */* _ 

17. — (; 2 + 9) 1/2 /9t + C > 9 sin V 3 / 

21 . -(1 + -);':‘ / 2 ( " lrt ) ) + 27 Vz "/!^ 00 

“•^fc 25 31.V + 9 ^ /16] 

35 i tan 2 a + In |cos «l 

nlJ . r 3 * + 21n|(*r?/(* + 4> l,m C 

Section 1. In |(x — 5)/(x + 5)1 + 7 

149 5.*ln2 + ^ln3-^lnll 7... 

(1 - O )v 0 e a - > *'/[l - a - JoO 


15. -V sin -1 V2 x + C 
V9 - (> - lP + c 

ilnU+x 2 l + C if«-- - c 

29. |* - i sin 3 x cos x a s ' n v j tan 2 „ + |cos a| + C 
33. —5 cos 20—1 cos 48 + C 


9. In 5 — § In 2 U-XO 4 
13. Using x(0) — a + 1 


- el 1 -** 1 )) 

V J L „ 7. Divergent 

Section 1. Divergent 3. 2 5 - l2er 16 r /6 18 T / 4 

1410 11.1 33.1 Jf ; Divergent « / 

21. b> - 1 24. 3 4 

Chapter I 5 ^ = c , lM 3 ,_* - 

Section In the following answers, C denotes a ‘ c onsU ln | x | + C 7 - C| 'l 

15.1 C(x 2 -l)-'-l 5. y + in 111 - * + 


9 y = x 3 /[Cx 3 + §1 31 * /W ** 


9. Divergent 

19. < 0 


iv _J— 

13 S = In tan ^ ^ 


_xtan-*0n*Wl (b) * 


= C - x/2 


tan-' (V2 cos 0)1 16 W Y x ‘ ‘ . u$e Theorem 4 and Example :3 

Section 3. The operator is fl 2 - f ^ () ( f ( 7 - A 2 /108)e' 6,/t + ' 2 / 6 “ *^J sin 2i + 
15 2 4. HO = (6 - 50-f' 3 ^ 2 X 9 . (2 , - r 2 + 

(b) If c > 0,P(0 — 00 as( , ”1 
17. First, Ki(0 = Kr(0)e“', then 


F 2 (0 


-[ 




ori?C + IJ 


a/?c + 1 


Chapter 16 5. 6 + 3* + 

Section 1. 0 967, error < 0 0013 + 

16.1 4H+2d + ,' V+ „ >1. £ (21 4- .)' 

? H,tn 2f,V 9 - *£< _ x s /6 + x 4 /24, error 

X 3 X S X 2 , 15 l-r +,/ - 

13. v+-+ 7T + ^+ 17 . v + 37 • rrro' 

-C-.XV5'. where c is some number bCW ^° “" J „ and T 19.1n-e2- 

(1 _ c2) -er 2(9 + 720= -r 24c 4 ). where dies be. 





2” + In' 11. T(n) = o(-§)* + iffi" + (sr" “ in') 2 " 13 ‘ /W = 

(8/ - 9)2' + 10 15. Y(f) = 2 3' - 1 - 3 1 18- (a) Setting SO) = DO) 

gives 3P(r - 1) = 10 - 4P(t) or (4A + l)P(r — 1) = 10 (b) P(l) = c(-|)' + 

V 20 - (b)/w “ I r©” + ^ 1 . 


where a is arbitrary. 22. (a) The principal $P(n + 1)$ equals the principal $P(n)$
plus the interest, $0.06P(n)$, on $P(n)$, minus the payment R. (b) $P(n) = [P(0) - 50R/3](1.06)^n + 50R/3$. To make $P(k) = 0$, choose
$R = \dfrac{3P(0)(1.06)^k}{50[(1.06)^k - 1]}$. 23. (a) The basic difference equation is
$N(t + 1) = N(t) - \beta N(t) + \alpha[N - N(t)]$. (b) As $t \to \infty$, $N(t) \to \alpha N/(\alpha + \beta)$. 25. (i) $P(t) \to (a + c)/(b + d)$ as $t \to \infty$. (iii) $P(t)$ oscillates but converges to $(a + c)/(b + d)$
as $t \to \infty$.

Chapter 14 

Section i 3 2(6 — a) 5. f 7. § 9. 6 10. The area below c /is c times 

1 4.1 the area below / 12. Area above [a, b] is area above [a, c] plus area above [c, 6] 


Section I. Iff, H 2 (» 4 l)/2n, (n - l)/2n 4. \n, -in 6. (n + 1) X 

14.2 (2„ 4 l)/6n 2 , (n - l)(2n - l)/6» 2 9. 4, 0, J, 2, |§ 11. (a) 12 

(c) 1 (e) i 


Section 1. x 7 /7 + C 3. jr 4 /4 - 2.v 3 /3 + 3 a: + 4* 5 + C 5. 3z 2,3 /2 + C 

14 3 7. (1 4 ar 2 ) 3/3 /3 + C 9. (e- - 2)73 4 C 11. -J cos V 4 C 

13. at - In |* 4 2| 4 C 15. i In (2 4 sin 2 I) 4 C 17. e*'» * 4 C 
19. 4 C 21. 2 In (e‘ + 1) 4 C 


Section 1. -4 3, 455 

14 - 4 13 § is m-f 1 


5. V7 - V3 7. 0 
(16 - 4 x 2 )dx 


9. } - 2 In 2 11. 2[e^ - e] 


Section 

14.5 


■/; 


fo(*) - g(x)]dx + / [h(x) 


r 

J 1 /: 


/• 1/2 

3</a: + 


ri/2 r8 

/ 3<£* + / (4 — y/2x)dx = 3**- 

•>0 ■/ 1/2 


V^ inCe ^ *, F(x) — {—A/ In m)m 1 + C Imposing the condition 

14.6 F{ 0) = 0 gives C = /1/ln /x Hence, F(x) = 04/In M )[l - m” 1 ] 4. Let j = 

100,000 m The average total repair cost is approximately 

/•400000 - 1 

/ A (j/ 1 00,000) 1 t2 ds = 1,600,000 A/7, 

J o 

CO 50 

5. (a) E loonvp(t) = E -4prtv(§)> 

(b D^/f'^r^dnl)- 2 

7. (a) 7028, 702 8 (c) 3559, 355 9 

11. SPM + + 1)5(25 - /1A - ^)/(S - /t) 


Section 3. (a) <x 2 /4)(2 In * - 1) + C (c) x sm ~ » x + (I - *2)i/2 + C 

14.7 (e) +(!/*) X cos (l/x) - sin (1 /a:) + C 4. (a) (25e 8 - l)/2 (b) tt - 2 




i tii /r„2 4 - h 2 '! 11. You can replace cos 2 x in 

1 - s,n' \ “loftha, .he answers in .0 and 11 differ only hy 

a constant „ 3/2 „ 

. , ,,,v _ „«+t/fn + 1) 5. 2(t + 2) 3/ "/3 - 

Section 1. In (19), choose g(x) = /W ' . / ,i + , 9. a i„ (2 + sin 2 t) + C 

14.8 4(r + 2) 1/2 + C 7. 4(r 2 + r) 12 + C Let «, t + ^ + c 


4 (/ 4 - z ) — r *♦ • ' ,c 

11. i tan - 1 (x/5) + C Substitute tan 0 = x/5 


15. 


V2x + C 17. -U 2 + 9) 1/z /9l + c 


19. sin -1 


( t T J )' 


Section l.Jjlnlfx- 5)/(x + 5)1 + <- 7 ^ ‘ 9 . ln j_Jln2 ” »' rt = 


11. y« = 


*>-&- *•* » W ~ 7 . Divergent 9 . Divergent 

Section 1. Divergent 3. 2 D . -/fi 

14.10 11. 4 13. 1 IS- Divergent 16 */> 


13. Using x(0) * a + / 

Divergen 

18 x/4 19. * < 0 


21 . fc > -1 


24_ 31— _ 4‘-“ 36. A/r 

Chapter 15 


1 fill = Ce 31 ’ 12 3 - " 

Section In the following answers, C denotes a c° nslmlt ( | ' + c 7. y - Cl 1 '! 

151 C(x 2 - ll- 1 - 1 5. y + mW = * + lnw T , fir J_ x 

„ n rtjrt-2*-"* 13.S = lntan|— V2 X 

9. y = x 3 /[Cx 3 + §1 u - -" x) c i xjl 

tan- 1 (V2 cos 6)1 16- (a) y = x lan [1 " 1 ' Theorem 4 and Example 3 

Section 3. The operator ,s D 2 - 6 2 -(*> - »$ + » S?“/6 _ tr/18 + lV«S 

- * SW - 5 J iS&ZZ&Sw- i 

15 cos 2(> 4 * 7 14. (a) W 0> ^ equilibrium state is (u - «)/^ b 


152 


Wife > 0,P~(r)-> “ as / — r 00 ,fc < 
17. First, Ki(l) = V i( 0 )a»‘, then 


1/. rirsi, r iw — ' 1V •* ; n 

T i?/3 Ki(0) ,/*„ 4- 


j?/3Ki(0) e o< 
a/?c + 1 


Chapter 16 5 6 + 3v + 

Section 1. 0 967, error < 0 0013 , 3 ',°* in^+x 2 ! 1 " 2) ‘ /2 +/ 1<ln 2)7/3 + x st 

16.1 4x :+2 x3 + x 4 V + ...r/^'an-rv 

Zy0n2)‘A- 9. 4 £ ( ** 1/6 + v 4 /24 . error 

. t + x 2 /2 - *V« + X/ ' 

17. x + £ • 11101 


X 3 , x= xf, . 15. 1 

13. x + — + -jr + 7 , + 


19. 1 + r' 2 ' 


-e->xV 5 - where c is some number between 0 and x 



Section 

162 


v2/s + jc 3 /16 - 5 jc*/128 + 7* 5 /2S6, error -21(1 + c)' 11 ' 2 / 1024, where c lies 
between 0 and x 21. 5/T05« 1 0246882577, eorrect to 9 dec.mal places 
2J. i + = 0 5208, rr/6® 0 5236 25. 9 025 27.0 669 


1 . a n Yj 

1=0 


(£j (x/af, radius of convergence is R - i<i| 


3. £ I 

it o 


)x‘,7!= 1 


7. £ ( 3 f )*!,.«= 1 11. 1 3176 13 0 4613 15.0 2726 17. (1 - x)-~ = 

D (1 _ x)-> = D j: (~‘) (-1 fxf = J^T,') (Use Problem 2 ) 


Chapter 17 

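Section 17.1

1. $\partial S/\partial x = 2xy - y^2$, $\partial S/\partial y = x^2 - 2xy$ 3. $\partial z/\partial x = x(x^2 + y^2)^{-1/2}$,
$\partial z/\partial y = y(x^2 + y^2)^{-1/2}$ 5. 0//dr - s" 1 4 " _2 . ^f/ds = — r

7. $\partial g/\partial x = 3x^2 + 4x - 6y$, $\partial g/\partial y = -6x - 12$ 9. $-\sqrt{3}/2$ 11. $\partial^2 u/\partial x^2 = -y \sin x$, $\partial^2 u/\partial x\,\partial y = \cos x$, $\partial^2 u/\partial y^2 = 0$ 13. $\partial^2 g/\partial x^2 = y^2 z^2 e^{xyz}$, $\partial^2 g/\partial y^2 = x^2 z^2 e^{xyz}$, $\partial^2 g/\partial z^2 = x^2 y^2 e^{xyz}$, $\partial^2 g/\partial x\,\partial y = z(xyz + 1)e^{xyz}$, $\partial^2 g/\partial x\,\partial z = y(xyz + 1)e^{xyz}$, $\partial^2 g/\partial y\,\partial z = x(xyz + 1)e^{xyz}$ 18. (a) r = (1 - 4 ^) _1 (a -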

JST 4 / + C) 


Section 

17.2 


1. — 2(t 4 l)/r 3 - lt/{t - l) 3 3. ^r 3/2 (2 + la 0 cos x> 4 2r sin xy 

5. dw/dr — 2x sin t — 2y cos t, dw/dt = 2 xr cos t + 2 yr sin t 7. dw/dt = 

( x - y)/(x 2 4 y 2 ), dw/dt = o(x sin t — y cos t)/(x 2 4 y 2 ) 9. dw/dr = 

(l/»v)(x/ 4 y sin r — zf sin r), dw/dt = (l/w)(xr 4 yr cos / 4 z cos r) 11. (a) 
du Du „ , du „ 09 dii „ , du n ... ,, , 

— = ■— cos 0 4 — sin 0, — = -r— sinfl + r— cos 0 (b) Solve 

dr Ox dy du dx dy 

the equations in (a) for (du/dx) and (du/dy) 13. (lay 4 3x 2 )/2(y — ax) 

15. — (e* sin y 4 e y cos x)/(e x cos y 4 e v sm x) 17. — y 2 e*/(l 4 ye x ) 

20 (a) x/r (b) -y (c) y/r (d) x 


Section

1. $\partial f/\partial x = 2x - 2$ and $\partial f/\partial y = 6y$ are zero at (x, y) = (1, 0). The matrix of
second partials $\begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}$ is positive definite. Thus, f(1, 0) = −1 is a relative minimum.

3. $\partial h/\partial x = 2y$ and $\partial h/\partial y = 2x$ are zero at (x, y) = (0, 0). The matrix of second
partials $\begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}$ is indefinite. Hence f(0, 0) = 0 is neither a maximum nor a
minimum. 7. $\partial h/\partial x_1 = 2x_2$ and $\partial h/\partial x_2 = 2x_1$ are zero at $(x_1, x_2) = (0, 0)$.
The matrix of second partials $\begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}$ is indefinite. Hence f(0, 0) = 0 is neither a
maximum nor a minimum. 11. The function $f(p_1, p_2) = p_1 p_2 (1 - p_1 - p_2)$ has
a relative maximum at $p_1 = p_2 = \frac{1}{3}$. (In this case $p_3 = 1 - p_1 - p_2 = \frac{1}{3}$, also.)
There is no absolute maximum. For instance, take $p_1 = 2k + 1$, $p_2 = -k$, and
$p_3 = -k$, where k is any positive integer. 15. (b) $8k + 16k(c_1 - 1) + 12k(c_2 - 2) + 8k(c_1 - 1)^2 + 24k(c_1 - 1)(c_2 - 2) + 6k(c_2 - 2)^2 + 12k(c_1 - 1)^2(c_2 - 2) + 12k(c_1 - 1)(c_2 - 2)^2 + k(c_2 - 2)^3$
16. $y = 195x/58 + 73/58$ 17. (b) The least squares line is $59w = -44x + 147y + 248z$


Section 1. f(l, = % 3. when (x, y, w) = (||, 5. ^ when 

1 7.4 u ~ v = w = 2/y/Z 7. when C = ^ and L = 8 Differentiate 

the function F(x, >) — \(px + qy — B), where X is a Lagrange multiplier, to 





Section 

17.5 


obtain X — 

(l/p)(dF/3i 

: = (1 /<l)0F/3>) r 

S17 2 A „ gg 35 w hcn x 

67 6 

= and v = W 

1 3 
*• 2 

3. 121a 7 /5 

+ 20n 5 - 13o 3 . 

9 ‘ *[| " 

f! 

11. 7r/ 2 /4, one-fourth 

13. f = 1 

r,r 

d>dx IS. 5 + rr/3 - 

5 J 

o J v** 3/2 


17. 3w/2 




5, 4 In 3 

a 


6 (3tt/ 4) In 2 


m ji=? 
' 


dydzdx 


Section 

181 


Section 

182 


Section 

183 


Section 

184 


Section 
18 5 


Chapter 18 

„ ny, {xvz\ one element ( c ) 
1. Cali the men a, b, c and the women *•>' ^ dements 3. Let N denote 

{abc, abx, aby , abz, acx, acy, acz, box , >■ < Q NOONi ONON, ONNO, 

on and O denote off (a) E n f = f = = loOOOOOON, OONO, ONOO, A'000) 
OONN) (c) E U F = E (.d) E ’ . g and c a nd let 1, 2, and 3 

(e) E' n F = 5. "^’ n a d n ' d V tc Then each element of 0 is a hst of 

denote the respective links AB, A , x ( 2n \ 

possible links ft - (None. 1, 2, 3, 12, 13. 23, 123) 7. 00 +J 

/ \ a n r n c (g) n 5 n c ) u 

2 (aUUBUC (c) A' fl B' O C W ^ 2 „ 7> The chairman 

(^4 n B’ n C) U n B n O 4 - ^ ( 4, + 41 » 48 such arrangements 

is always pivotal in positions 4 or 5 There ^ by A are 12 such 

In addition, he is pivotal in position 3 if he P 
arrangements (48 + 12-60 tota 


Section 

18 6 


; n >, i\( ' 7* W Hi (b) 9 

1.0 464 3. tffH“ 0923 H'/Wl m '_ mt choose him W 

9 (a) j/yy - 1) He can choose any one ™ person must ^ ^ ( „ _ 

Vt -wor-v (0 (N - Di-t/ (w 

fc + D/CW - 6 Let C, be the evenUhat the rth 

* 4 ‘ ® 09 ^ „ 1 Theorem 7 to obtain (U^) 

photograph is correctly ma c problem is equivalent to 

k = 3, 4, 5, and 6, the answers are J, TSo- 720 

Example 5 ) n .. and /’(300500) = 0 38 

Order three lots 4 - L p(jVfj n M2) = _____ I ** ^ 

Then PCMi n M 2 1 Mi U Me) = TTMz) f + * Z (jj‘ Theorems 4 and 

, 1 s _ 48 p(x 1 I > 2 ) - 3T ' r-x .. O nrobablllty 

6 Partial answer P(xi | > 1 ) *»•* ’ ^ C fcFUC 

6 together with the facts that Tp^i 

function 11. Only (b) is true m genera 

rf W (1 


1. 0 66 3. (a) 

7. (a) K C=) 

(100 - jc7(99 - x) 
100 99 




- pv 




(b)^ 

1 2 Plan 3 - " 100 99 98 

100 " / ,ni 




section 1. As l mcreases, Pa - 1. 3- 1 - D - gWW" ».999999W 

18 7 5. (a) P(ofn contacted, at least one shows) = 1 (j) - P»- Choose n — ll to 

make p„ > 0 99. (b) ®® 3 - *■ W») ! . 9.0.76^ IS._ HE) - 

$(n + 1)/2^n$, $P(F) = 1 - 2/2^n$, and $P(E \cap F) = n/2^n$. $P(E \cap F) = P(E)P(F)$ is
equivalent to $2^{n-1} = n + 1$, which is true only when n = 3.
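
The claim that independence holds only for n = 3 can be confirmed by exact arithmetic. The Python sketch below plugs the three probability formulas quoted in this answer into the product rule for a range of n. The function names and the search bound of 30 are illustrative choices, not part of the text.

```python
from fractions import Fraction


def probabilities(n: int):
    """Return P(E), P(F), P(E and F) from the formulas quoted in the answer."""
    total = 2 ** n
    p_e = Fraction(n + 1, total)
    p_f = 1 - Fraction(2, total)
    p_ef = Fraction(n, total)
    return p_e, p_f, p_ef


def independent_sizes(max_n: int):
    """All n in 2..max_n for which P(E and F) equals P(E) * P(F) exactly."""
    hits = []
    for n in range(2, max_n + 1):
        p_e, p_f, p_ef = probabilities(n)
        if p_ef == p_e * p_f:
            hits.append(n)
    return hits


print(independent_sizes(30))
```

Using `Fraction` keeps the comparison exact, so the equality test is not affected by floating-point rounding.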

Chapter 19 

Section 1. {77T} , {HHH, HUT, THH, HTH ) , {P7T) , {H7T, TTH, TUT) , fi. 2. P(A = 
19.1 0) = i(,P(iV = 1) - A,PW = 2) = ij, P(N= 3) - A, and P(N - jr) - Oir 

x is not 0, 1, 2, or 3. 3. If x < 0, P(H < at) = P(fl - 0. If 0 < x < 1, 

P(1V < *) = P(W = 0) - -TO If 1 < * < 2. P(N <x) = -fo. If 2 < x < 3, 
P(W< a:) = £}.If3 < at,P(V< a:) = P(fi) = I. 5.0.4,0.108,1. 7. (b) i, 

1 - 4e- 7 /5, 4(e“ 2 - e- 5 )/5, 4e- 2 /5. 

Section 1. *, A, A, *, A, A. A. A- 2- W = 0) = A, W - « - A. W - » “ 
19 ' 2 Jt, W - 3) - A- In general, P(V - A) = KJ (D*(5> 3 "‘ for C - 0, 1, 2, 3. 

9. (a) W ( 7 4 ) @ 4 © 3 , 00 Q (|) 5 (|) 2 , On) S ( (}) (D*(|) 7 -‘. 12. A 

4-engine plane is safer if and only if q < £ 14. (a) (b) 2 \° 3 . (c) 553. 

(d) As 15- 8 18. (P")‘(l - f*r*. 19. (a) j> 5 (b) A- 

20. Test III gives the strongest evidence in favor. 23. (a) (c) fj. (e) 


28. For k -= 0, 1, 2, ... , t, P(X = k) ■- 


(0 


(PAPp)*(l - Pa Pa) 1- *. 


Section 1. 7 2. P(X = 0) = q, P(X = 1) = p, so E(X) = p · 1 + q · 0 = p.

19.3 3. 2p 5. (a) £<2W 2 ) (c) £(s“) 6. (a) 8. (b) 0. (c) 4. 8. $2000. 

9. E(C) = JjJfl, Var (O - 1 580 12. (a) $2 50. (c) $1.90. 

15. E(X/n) = p, Var (X/n) = pq/n.
k = 1, 2, 3, , E(T ) ■ 


19. (a) P(P = k) . 

Var (T) = - 1). (b) $P(T = k) = 1/n$, $k = 1, 2, \ldots, n$, $E(T) = (n + 1)/2$, $\mathrm{Var}(T) = (n^2 - 1)/12$.
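
The moments quoted in part (b) are those of a variable that is equally likely to take each of the values 1, 2, ..., n, with mean (n + 1)/2 and variance (n^2 - 1)/12. A minimal Python sketch (an illustration added here, with the hypothetical helper name `discrete_uniform_moments`) computes both directly from the definition so the closed forms can be compared:

```python
from fractions import Fraction


def discrete_uniform_moments(n):
    """Mean and variance of T when P(T = k) = 1/n for k = 1, ..., n."""
    prob = Fraction(1, n)
    values = range(1, n + 1)
    # E(T) is the probability-weighted sum of the values.
    mean = sum(k * prob for k in values)
    # Var(T) is the probability-weighted sum of squared deviations.
    variance = sum((k - mean) ** 2 * prob for k in values)
    return mean, variance
```

For example, `discrete_uniform_moments(6)` describes a fair die and should agree with (6 + 1)/2 and (36 - 1)/12.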

(b) 3000 (c) 4000 or less, (d) 2000 (e) 0 or 1000. 


21. (a) 3000 


Section 1. P(Y = 1) = 0.425, P(Y = 2) = 0.188, P(Y = 3) = 0.158, P(Y = 4) = 0.109,
19.4 P(Y = 5) = 0.073, P(Y = 6) = 0.047.


4. (a) • 


P(Y = k | X = 2): 0.627, 0, 0.158, 0.114, 0.066, 0.035

6- (a) tH* (b) • 


k            5     6     7
Probability  7/15  7/15  1/15


(b) 1 - © s (c) Q (f) 1 ® 4 . 11. (a) P( Z = *) = m »(2 

Q* +1 ), ft = 0, 1, 2 13. P(tl„ - r) = 0Lli±iy _ (KjzSJ, 


■1,2, ., V 





Section 
19.5


Section 
19.6


-n m nn- PIX = 1) = a "go, from Problem 8, 
1 1 4 20 34, 27 10 (a) E(X„) - A-*» - ,,,. _ 2-, 

Section 18 6 ’(b) E(T) - WO - «>; Var (T) = W»_+- ^ " Cho “ l 

13 Let p. = P(0.), - = 1.2, ■ »' T h en Cov (*, *,) M 

/- = l 18 Choose k = oo 19. Choose £ 3 g 

3 Binomial rp and rp(l - p), /-onX and X, 

Negative binomial rq/p andi tj/p , Vmfo / ( = £(j x-i) = £( s -ij x ) = 

n(k - l)/2, Var (73 = «(*.- - l)/ i2 y « x - u ; _ 

1 1 3Jt — r'r/Vb r l = J7 a (r :i ) 11 ^123 - ? (1) 

Ie(jT) = i gx(s) gavM = l 1 


4(1 - i=)- I,2 ),-i 


oo The series T, kP(X 
1=0 


, 1) ( representing E(T), diverges 


Section 

20.1


Chapter 20 

$\pi(2) = \pi(0)P^2 = (\tfrac{1}{2}, \tfrac{1}{2}) \begin{pmatrix} 0.28 & 0.72 \\ 0.27 & 0.73 \end{pmatrix} = (0.275, 0.725)$

j\ n 0 0 0 \ 

$\pi(4) = (0.27275, 0.72725)$ 7. (a)l


$\pi(3) = (0.2725, 0.7275)$
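
The state distributions listed above follow the rule that each distribution is the previous one multiplied by the transition matrix. The sketch below is a hedged illustration rather than the book's own working. The one-step matrix `P` and the starting vector `PI0` are assumptions, chosen because they reproduce the printed values (P squared has rows (0.28, 0.72) and (0.27, 0.73), and a start of (1/2, 1/2) gives (0.275, 0.725) after two steps). The function names are hypothetical.

```python
from fractions import Fraction

# Assumed one-step transition matrix; its square matches the printed P^2.
P = [
    [Fraction(1, 5), Fraction(4, 5)],
    [Fraction(3, 10), Fraction(7, 10)],
]
# Assumed initial distribution pi(0).
PI0 = [Fraction(1, 2), Fraction(1, 2)]


def step(pi, matrix):
    """One transition: the row vector pi times the matrix."""
    size = len(matrix)
    return [sum(pi[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def distribution_after(steps, pi0=PI0, matrix=P):
    """Return pi(steps) = pi(0) P**steps by repeated multiplication."""
    pi = list(pi0)
    for _ in range(steps):
        pi = step(pi, matrix)
    return pi
```

With these assumed inputs the successive vectors approach (3/11, 8/11), which is the solution of v = vP whose entries sum to 1.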


8 (a) P 


(St) 


Section 

20.2

Section 

20.3


Section 

20.4


0 0 0 0 > 

p r q 0 0 

0 P r q 0 

0 ° P 0 1/ 

(c)tt(O) = (1,0), ir(l) = (° 2 ' 0 ,. 8) 1 ’ 11 1 I? ) O 8 8 8 0 88) 2 ’ 08 10 ParLl answer Togo 

tt( 4) = (0 1112,0 8888), ir(5) “ l®. 1 ” 1 ,’ n j p must hold their delegates, wh.le^ 

from state ABC to state AB, candidates A ^ a ^ , h e pro d u ct of which T2 
loses his The respective probabilities are 6 4- ] 0 \ , 3 (a) 

r i ■>) 

Problem 3 r 5 \ {5o} . and afC 

2. {Sr, Si, S 9 >, {Ss}, and {Ss, me' inTbal class forever 

open classes 9 (c) The chai , Each row is (&*• *’*’*£ 

1 Only the chain in Problem 4 “ r bem g in a particular compar 

*, *, A, A) The limiting ^ conipartme „, 3 (* f 

proportional to the number of entries even tual average annuli “ JjJ 

8 Expansion of present I— ,r — » ° f " h " h 
of $24 million The eventual total i-» 

go to Nichols • -b) This 


is to A, or the first transition is to ‘ 8 1/0 

= S 0 0331/d -09041) -55* 


, (1 _ OWI , 05 




i ^2 4 0^ (b) 3 (c) (0 \ 00 \ (d) 1 11. (a) 1724, 7 8, 41 1 

(b) 0000068 (c) ^ 14. Same answer as 12(b), since p„ = 0 for all non- 

absorbing states Use Problem 13 

Section 2. Solve the equation V = VP as in Section 20.3. 4. Two trials, if the chain

20.5 starts in either Di or Du. Otherwise, zero. 6. Proceed as in Problem 2.

7. The probability of transition to S1 is 5 , whether the process is in state S2 or in
state S3. However, the probability of transition to S2 is not the same from S1 as
from S3.

1. (a) Proceed as in Problem 25, Section 10.3, to obtain

V) «»(: !::) 

2. (b) U» m T - ir(l - 0)"- 1 + (1 - (c) 7T 6. uT 1 T 2 is closer 

to As while 0 T 2 T 1 is closer to Ai 


Section 

20.6 


Chapter 21 


Section 

21.1 


1. (a) 0.084 (c) 0.735 2. (a) 0.000005 (c) 0.000855

(b) 0 982 5. (a) 0 918 (b)2 5 (c) £ e ~* 5 " (2 5M>> - 

*=o k } 

(d) $\lambda/(1 - e^{-\lambda})$, $\lambda[1 - (\lambda + 1)e^{-\lambda}]/(1 - e^{-\lambda})$, where $\lambda = 2.5$
0.510, 0.019 13. (a) (i) 0.000001 (ii) 0.000012 (iii) 0.000474

(ii) 0.857 (e) 1, 1 (f) 0.897


3. (a) 0 160 
.-2 sy (2 5 Ap* 

k' 

10. (a) 0.471,
(c) (i) 0.744


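Section 1. Limiting probability of $E_n$ is $e^{-\lambda/\mu}(\lambda/\mu)^n/n!$, n = 0, 1, 2, .... 3. Limiting

21.2 probability of $E_n$ is $[\mu/(\lambda + \mu)][\lambda/(\lambda + \mu)]^n$, n = 0, 1, 2, .... 6. $P_n(t) = e^{-\beta t}(1 - e^{-\beta t})^{n-1}$.

8. $a_n = (\lambda/\mu)^n a_0$. Since $\lambda < \mu$, $a_0 > a_n$ for $n \ge 1$. 10. (a) f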

(b) 


Section 

21.3 


Section 

21.4 


1. £(71 = i Var(7) = i P(T < 2) = 1 - 2 . (a) 1 682 

4. (a) $E(X) = (\beta + \alpha)/2$, $\mathrm{Var}(X) = (\beta - \alpha)^2/12$
(c) $f_Z(z) = e^{-z}$, $z > 0$
$= 0$, $z < 0$
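
The mean and variance in 4(a) have the form that arises for a variable with constant density 1/(beta - alpha) on the interval from alpha to beta. Assuming that is the setting of the exercise (the answer key does not restate it), the two formulas can be checked numerically. This minimal Python sketch uses a midpoint Riemann sum; the function name `continuous_uniform_moments` and the step count are illustrative choices.

```python
def continuous_uniform_moments(alpha, beta, steps=100_000):
    """Approximate E(X) and Var(X) for constant density on [alpha, beta].

    Uses the midpoint rule, so results are approximations, not exact values.
    """
    width = (beta - alpha) / steps
    density = 1.0 / (beta - alpha)
    midpoints = [alpha + (i + 0.5) * width for i in range(steps)]
    # E(X): integral of x * f(x) over the interval.
    mean = sum(x * density * width for x in midpoints)
    # Var(X): integral of (x - E(X))**2 * f(x) over the interval.
    variance = sum((x - mean) ** 2 * density * width for x in midpoints)
    return mean, variance
```

For instance, with alpha = 1 and beta = 4 the closed forms give (4 + 1)/2 = 2.5 and (4 - 1)^2/12 = 0.75, and the numerical values should agree to within the discretization error.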

6. (a) ^ (b) 5 8 /o = 3 In (|)« 2 749 12. e“ 18 « 0 00000 

t.65 (b) yjfif (C) THXC (d) (e) i£f 


(b) 1 975 


19. (a) 


1. 0.0008, 0.8392 2. 0.7679 4. $\sqrt{\pi}\,e^{1/4}/2$, $\pi e^{1/2}/2$

(b) 0.1796 10. No 12. Snack bar 372, Cafeteria 705


5. (a) 0.3592





author index 


Adelman, Irma, G , 640 
Anderson, K W,91, 118, 
293 

Anderson, N H, 387,451 
Anderson, T W , 592 
Atkinson, R C , 530, 646 
Audley, R J, 547, 656, 
658 

Bales, R F , 588
Ballenger, W , 663 
Bartle, R G, 89, 411 
Bartos, O J , 635, 641 
Baumol, W J , 313, 358, 
377 

Bavelas, A , 53 
Bayes, T , 542 
Beal, G M , 584
Beatles, 9 
Bernoulli, J , 564 
Blalock, A B , 220
Blalock, H M , 220 
Blum, M L , 657 
Blumen, I , 632
Bohlen, J M , 584 
Boole, G , 207 
Bortkiewicz, L v , 655 
Bower, G H , 144, 530,
640

Brems, H , 343, 475
Brown, D L , 553

Bush, R R , 73, 74, 103,
117, 190, 213, 327, 421,
512, 545, 567, 640, 648,
649


Cahndra, A , 544 
Capkm,T,i 56 
Cartwright, D , 213
K . 306 

Chammah, A M , 574


Chiang, A C , 454 
Clarke, R D , 679 
Coddington, E A , 454
Cogan, E J , 74 
Coleman, J S , 212, 665, 
667 

Coombs, C H , 64 
Corcoran, D W J , 23 
Cornsweet, T N , 453
Cox, D R , 664
Cramer, G , 248 
Cronbach, L J , 535 
Crothers, E J , 530 

Davies, A D M , 173 
Davies, M G , 173 
Davis, J A , 44 
Deese, J , 446 
Descartes, R , 51 
Deutsch, K , 81, 565 
Devletoglou, E A , 101 
Dodd, S C , 435 
Domar, E , 454 

Edwards, W , 519 
Estes, W K , 545, 640 

Fabrycky, W J , 421 
Fechner, G , 380 
Feller, W , 553, 557, 614,
639, 650, 689 
Fisher, R A , 571 
Freund, J E , 246 
Friedman, M , 133 

Gagné, R M , 437, 511
Galanter, E , 73, 74, 213,
327, 567

Galle, O R , 306
Gass, S I , 246
Gleser, G C , 535


Goldberg, S , 25, 176, 388, 
389, 557, 614 
Goldhamer, H , 625, 638 
Good, R A , 91, 176, 293 
Graham, C H , 437, 511 
Greenberger, M , 20 
Guilford, J P. 380 <106 
Gulliksen, H A , 111, 287,
496 

Haight, F A , 144 
Hake, H W , 533
Hall, D W , 91, 118, 293
Harary, F , 213
Harrah, D , 82 
Harrod, R . 358, 376, 
Hauser, P M , 103 
Hecht, S , 428 
Hirchman, I '51,46 
Holm, F E , 246, 269 
Homans, G C , 299 
Horst, P , 213
Hull, C L , 336, 443
Humphreys, EG ,514 
Hyman, R , 372, 533 

Jonckheere, A R , 547

Kamiya, J , 12
Karlsson, G , 643 

Kau.l-.I95 630 

Kemcny, J ° • 

650 

Keyfitz, N , 466

Kintsch, 'V .13' 

Kogan, M , 632
Kobn, R r • '- 8 
Korbel, J , 20

Kreweras, G , 645

KroPP.P- 3 - 5 ,., 

Kuder, G F , 513





Labovitz, S , 213 
Lagrange, J L , 493 
Langer, S K , 47 
Lazarsfeld, P F, 212, 592 
Leeman, C P , 644 
Leibniz, G W , 297 
Leontief, W , 209 
Longton, P A , 452 
Lorge, I , 554
Luchins, A S, 5, 25, 47 
Luchins, E H , 5, 25, 47 
Luce, R D , 11, 58, 73,74, 
213, 327, 567, 596 


McCarty, P J , 632 
McGill, W J, 567, 676 
McGrath, J J , 361 
McLaurin, C , 459 


Madow, W G , 565 
Markov, A A , 615
Marshall, A W , 625, 638
Massarik, F , 82
Matras, J , 103 
Metzler, L A , 482 
Miller, G A , 300 
Miller, N , 451 
Mintz, A , 657 
Mischler, E G, 116 
Mosteller, F, 103, 117, 
190, 421, 512, 648, 649 
Murchison, C , 428 

Neifeld, M R , 303
Newman, E B , 618
Newton, I , 297, 425, 459
Norman, R Z , 74, 213

O’Hanlon, J F , Jr , 361
Orcutt, G H , 20


Parsons, T , 667 
Parzen, E, 176, 557, 614, 
689 

Pascal, G R , 522 
Pearson, K , 671 
Pfeiffer, P E, 551, 552, 
557 

Poffenberger, A T , 303 
Poisson, S D , 654 
Polya, G , 540 
Prais, S J , 631 


Raiffa, H , 58 
Raimo, K , 115 
Rainville, E D , 151, 468 
Rapoport, A , 476, 574 
Rashevsky, N, 212, 298, 
445 

Ratoosh, P , 82 
Restle, F , 14, 82 
Rice, D B , 16 
Richardson, L F , 298, 
344 

Richardson, M W , 513 
Rivlin, A M , 20 
Robinson, J A , 153 
Roby, T, 631 
Rogers, A , 635 
Rogow, A A , 11 
Russell, B , 4, 6, 8 


Samuelson, P A, 377, 482 
Scheff, T J , 576
Scott, W , 526
Shapley, L S , 165
Shils, E , 667
Shubik, M , 165
Sidman, M , 461
Simon, H A , 299
Smith, H , 616 
Smith, V L , 16 


Smith, W L , 664 
Snell, J L, 47, 74, 630, 650 
Solomon, H , 554 
Sommer, R , 516 
Stein, S K , 354, 442, 506 
Stephan, F , 116 
Sternberg, S H , 545, 645 
Stoyva, J , 12 
Styan, G P , 616
Suppes, P , 25, 47, 73, 74, 
327, 646 
Suttell, B , 522 
Symonds, G H , 245 


Taeuber, K E , 306 
Taylor, B , 350 
Thomas, G B , 354, 442, 
506 

Thompson, G L , 47, 74 
Thurstone, L L , 193, 269, 
300 

Torgersen, P E , 421 
Torgerson, W S , 62 


Vernon, P E , 529 


Wallis, J , 425 
Weber, E , 380 
Widder, D V , 680
Wilder, R L , 5, 25 


Yanouzas, J N , 306 
Yoesting, D R , 584 


Zannetos, Z S , 172 
Zelinsky, D , 213, 269 
Zinnes, J L , 73, 74, 327
Zipf, G K , 81





SUBJECT INDEX 


Absolute value, 83-88 
as distance, 85 
function, 83 
of a number, 83 
Adjacency matrix, 205 
Alternating series See
Series, alternating
Alternating series test, 142
Annuity, 124
Antecedent See

Conditional statement
Antiderivative, 402
table, 404

Arc sin function See
Function, inverse sine 
Archimedean property, 90, 
105 
Area, 401 
Argument, 40 
forms of valid, 40 
indirect, 42 
Associative law
for statements, 35
for sets, 17
Average

of a set of numbers, 124
rate of change, 297
revenue, 283, 310
speed, 294, 295
Axis

horizontal, or x, 77
vertical, or y, 77

Base

eight, 137

for logarithms, 333
of the number system,
133-38
sixteen, 138
ten, 136
three, 136
two, 135

Bayes’ theorem, 542 
Bernoulli trials, 564-74 
Beta random variable See 
Random variable 
Betweenness 

for points, 233 
for sets, 14 
for vectors, 234 
Biconditional statement, 2 
Binary arithmetic table 
for addition, 136 
for multiplication, 136 
Binary digit See Bit 
Binary expansion See Base 
two 

Binary relation, 48, 50 
Binomial 

coefficients, 167-76 
expansions, 460 

Expansion Theorem, 1 

random variable See 
Random variable 
Birth and death processes, 

660-69 
Bit, 136 

Boolean arithmetic, 20 
Bound

lower, of a sequence, 94 

lower, of a set, 89 

upper, of a sequence, 94 

upper, of a set, 88 
Bounded

above, for sequences, 
above, for sets, 88 
below, for sequences, ^ 
below, for sets, 90 
sequence, 95 

Burden of debt model, 454

Byte, 138 


Cartesian product, 51, 152


Central Limit Theorem, 

687 

Certain event See Event, 
certain 

Chain Rule, 316 

for two variables, 479 
for several variables, 477 
Change of variable 
in integration, 430 

in a quadratic form, 262

in summation notation,
121 

Characteristic 
equation, 256 
polynomial, 256 
root, 255 
value, 255 
vector, 255 
Class See Set 
Cobweb model, 101 
Collection See Set 
Combinations, 161 
Combinatorial analysis, 

Common logarithm See
Logarithm, common

Communication class, 68

for a Markov chain, 624

Communication relation,

for Markov chains, 624

Commutative laws
for sets, 17

Comparable elements of a
relation, 63
Comparison
series, 13a

Complement

function, 79 




laws, for sets, 17 
of a set, 10 
Complementary 
equation, 382 
solution, 382 

Composite function See 
Function, composite 
Conclusion, of an argu- 
ment, 40 

Conditional probability, 
531-47 
function, 533 
Conditional statement, 28 
Conjunction, of state- 
ments, 28 

Consequent See Condi- 
tional statement 
Contagion model, 436 
(See also Pure birth
process, contagion in)
Contrapositive, 37 
Convergence See Se- 
quences, Series 
Converse, of an implica- 
tion, 36 

Corner point of a solution
set, 232 

Correlation, 604 
Counting, 152-67
criterion, 153 
principles, 152, 154 
Covariance, of two random 
variables, 601 
Cramer's Rule, 248 


Decimal expansion See 
Base ten 

De Morgan’s laws 
for sets, 17 
for statements, 35 
generalized, 21 
Density function See 
Random variable 
Derivative, 297, 503, 504
directional, 471 
higher order, 347 
of a constant function, 
301 

of identity function, 301 
second, 347 
Determinant, 247-52 
function, 248, 249 
minor, 256 
principal minor, 257 


Deviations, of numbers 
from their average, 
124 

Diagonal matrix See 
Matrix, diagonal 
Difference 
equation, 355, 356, 
370-89 

backward, 377 
forward, 377 
initial conditions for, 
372 

linear, 370 
order of, 371 
particular solution of, 
382 

solution of, 370 
transient part of solu- 
tion of, 384 
function, 356 
of two functions, 280 
of two sequences, 95 
of two sets, 16 
second, 356 
table, 357 
Differential 
calculus, 294-354 
equations, 298, 443-54 
linear, 447 

Differentiation See Derivative

Diffusion model, 436 
Digraph See Directed 
graph 

Directed graph, 205-8
Discrete data, 355 
Disjoint sets, 13 
Disjunction, of statements, 
28 

Distance, 76, 82 
between vectors, 196 
Distribution function See 
Random variable 
Distributive laws 
for sets, 17 
for statements, 35 
generalized, 20
Divergence See Sequences,
Series

Domain of discourse See 
Set, universal 
Dummy variable See Sum- 
mation notation 

e, constant, 117, 333 


Economic multiplier, 128 
Eigenvalue See Character- 
istic root 

Eigenvector See Characteristic vector
Elasticity of demand, 319, 
365 
Element 
of a set, 3 
of a vector, 187 
Elithorn Perceptual Maze
Test, 173 

Empty set See Set, empty 
Equations, linear See 
Linear equations 
Equilibrium 
of a system, 344 
state, 384 
Equivalence, 34 
class, 54 

logical, as an equivalence 
relation, 60 

Equivalent statements, 35 
Event (s), 514 
certain, 514 
compound, 514 
impossible, 514
independent, 547, 548 
occurrence of, 514 
pairwise independent, 
550 

simple, 514 
Exclusive or, 28 
Expected value See Ran- 
dom variable 

Exponential function See 
Function, exponential 
Extinction process, 662 
Extreme point, of a set, 232 

Factor Analysis, 193 
basic equation of, 194 
Factorial, 139, 155 

moment generating func- 
tion, 674 
Fallacy, 41 
Feasible vector, 239 
Flow, diagram, 68 
Function, 75-83 
additive, 293 
as a binary relation, 75 
as a machine, 80 
composite, 315 
concave, 348 
continuous, 288 





Function (con’t ) 
convex, 348 
decreasing, 310 
domain of, 78 
elementary, 273
even, 293 
exponential, 333 
identity, 79 
increasing, 310 
information, 596 
integrable, 401
inverse, 327 
inverse sine, 329 
multiplicative, 293 
odd, 293 

of n variables, 470 
range of, 78 
rational, 433 
real-valued, 272 
square root, 272 
squaring, 272 
value of, 78 

Fundamental Theorem of 
Calculus, 407 

Gamma random variable 
See Random variable 
Generating function, 

607-14 
bivariate, 613 
conditional, 613 
moment, 674 
probability, 608
Geometric random vari- 
able See Random 
variable 
Graph, 77 

Greatest lower bound 
(glb), 90

Grouping (of series), 132 


Identifiable linear function, 

221 

Identity, 27 
laws, for sets, 17 
matrix, 193 

number, for addition, 17 
number, for multiplica- 
tion, 17 
relation, 62 
set, for intersection, 17 
set, for union, 17 
Image 

of a point, 79 
of a set, 332 
Implication, 36 

as a weak ordering, 60 
Inclusive or, 28 

Incomplete beta, 671 
Independence of path, 618 
Independent increments, 

652 

Indifference curve, 497 
Infinite series See Series 
Inflection, point of, 348 

Input-output analysis, 
208-10 

Instantaneous 

rate of change, 297 
speed, 295 
Integral(s), 390-442 
convergent, 438 
definite, 401 
double, 500 
improper, 438 
indefinite, 403 
iterated, 498, 500 
lower sum for, 397 
table of, 423 
upper sum for, 396 
Integrand, 401 


Half plane, closed, 231
Hexadecimal digit See
Byte

Hexadecimal numbers See
Base sixteen
Hypergeometric random
variable See Random

variable

Hypothesis testing, 685

Idempotent
laws, for sets, 17
laws, for statements, 35
relation, 57


Integration
by parts, 421-26

by substitution, 426-33
of, 401

Interior equilibrium See
Leontief system
Interior point, of a set, 339
Intermediate value theorem


Intersection

of two relations, 207
of two sets, 10
of a collection of sets,
12, 20

Interval, 133


Invariant vector See 
Characteristic vector 
Isomorphism, 327 


Jacobian, 504 
Jacobian matrix, 503 


Lagrange multipliers, 493 
Latent roots See Charac- 
teristic roots 

Latent vector See Charac- 
teristic vector 
Least-squares line, 489 
Least upper bound (lub), 

89 

Least Upper Bound Ax- 
iom, 89 

Leibniz notation, for de- 
rivatives, 297 
Leontief system, 209 
closed, 209 

interior equilibrium of a,
210 

open, 210 
viable, 210 
Limit, 286 

of a sequence, 105 

of a real-valued function, 

278

Linear combination, 220
Linear equations, 214-19

equivalent systems of,

217

inconsistent system of,

particular solution of,
216

solution of, 214

Linear inequalities, 231

Linear programming,

237-45

Linear transformation, 262

Logarithm, 333 
common, 333 
function, 333 
natural, 333 

classical, or Aristotelian,

L0e 5e y se^on.rad,euon 

Logically true statement
See Tautology




Logistic curve, 436 
Loop, of a relation, 70 

Mapping See Function 
Marginal 
cost, 297, 416 
product, 298 
revenue, 282, 298 
utility, 298 

Markov chain, 615-50 
absorbing state of, 624, 
635 

canonical form for, 627 
expansion of a, 634 
flow diagram for, 626 
initial probabilities in a, 
616 

lumpable, 645
n-step transition proba- 
bilities for, 620 
periodic, 628 
regular, 629 
states of, 616 
transition probabilities 
for, 616 

Matching problem, 528 
Mathematical induction, 
97-104 

Matrices, 180-269 
addition of, 182 
additive inverse of, 183
blocked, 200 
column, 223 
diagonal, 204 
dimensions of, 180 
elementary column operations on, 223
elementary row, 221 
elementary row opera- 
tions on, 220 
equality of, 181 
inverse, 224-30 
computation of, 228 
main diagonal of, 181 
minor, 256 
multiplication of, 191 
negative of, 183
nonsingular, 225 
notation for, 180 
orthogonal, 263 
partitioned, 200 
powers of, 198 
preference, 181 
rank of, 259 
singular, 225 


skew-symmetric, 268 
square, 181 
square of, 198 
square root of, 265 
stochastic, 618 
sub-, 200 
symmetric, 260 
trace of, 187 
transpose of, 185 
zero, 162, 193 
zeroth power of, 198 
Maximum 
of a set, 88 

of an order relation, 73 
Maximum and minimum 
value theorem, 289 
Maximum likelihood, 571 
Mean See Average 
Mean value See Random 
variable, expected 
value 

Mean Value Theorem, 307, 
309 

for partial derivatives, 
474 

Mental tests, 496 
Metric, 82 

Minimax, of a function, 
487 

Minimum 

of an order relation, 73 
of a set, 87 

Modulo arithmetic, 73 
Monotone Convergence 
Theorem, 112 
Multinomial Expansion 
Theorem, 172 
Multinomial trials, 591 
Multiplicative inverse, of a 
number See Reciprocal

Multiplier See Economic 
multiplier 

Multiplier-acceleration 

principle, 377 

Necessary Condition, 38 
Negation, of a statement, 
27 

Normal distribution, 680- 
89 

Null set See Set, empty 
Number 

of bridge hands, 164 
of ordered r samples, 160 


O C curve See Operating 
characteristic curve 
Octal numbers See Base 
eight 

Operating characteristic 
curve, 546 
Operator, 363 
Ordered pair, 48 
Ordered samples, 159 


Partial derivatives, 469-77 
Partial fractions, 433-38 
Partition, 98 
of a set, 55 
of an integer, 172 
Pascal’s Triangle, 171 
Permanent income, 133 
Permutations, 155, 161 
PERT technique, 607 
Pivotal committee mem- 
ber, 165 

Poisson Process, 651-60 
independence assump- 
tion for, 652 
stationarity assumption 
for, 651 

Poisson random variable 
See Random variable 
Polya Urn Scheme, 540, 
623 

Polynomial, 282 
derivative operator, 447 
difference operator, 362 
Population, 519 
Power, of a test procedure, 
686 

Power curve, 686 
Power series, 455-68
Premise(s), 40 
inconsistent, 47 
Present value, 418 
of an annuity, 124 
Probability, 516 
a posteriori, 543 
a priori, 543 
conditional See Con- 
ditional probability 
multiplication rule for, 
548, 550 
vector, 618 

Probability function, 516, 
559 

conditional, 589 
joint, 586, 587 





Product 

of functions, 280 
of a matrix and a scalar, 
184 

of relations, 56 
of sequences, 95 
of vectors, 190 
Proper subset, 9 
Pure birth process, 664, 
667 

contagion in, 667 
Pure death process, 662 
Pythagorean Theorem, 76 


Recursive formula, 93 
Reduction formula, 424 
Reduction of order, 372 
Relations, 48-74 
asymmetric, 57 
connected, 64 
consistent, 61 
empty, 52 
equivalence, 53, 56 
indifference, 56 
induced ordering, 69 
irreflexive, 57 
loop-free, 70 
maximal element of, 73 


Quadratic form, 261 
indefinite, 267 
negative definite, 267 
positive definite, 267 
Queue, 664 
Quotient 

of functions, 280 
of sequences, 95 


Radius of convergence, 464 
Random variable(s), 558
absolutely continuous,
670
beta, 671 
binomial, 565 
expected value of, 579 
continuous, 562, 669 
density function of, 670 
discrete, 562 

distribution function of, 
559, 669 

expected value, 574-86 
exponential, 669, 679 
gamma, 688 
geometric, 566 
hypergeometric, 570
independent, 589
jointly distributed, 586
median value of, 677
negative binomial, 568
normal, 680
Poisson, 654
rectangular, 670
standardized, 585, 681
uncorrelated, 604
uniform, 585, 670
variance of, 518, 674
Random walk, 615
Ratio test, 139
Reciprocal (of a number),


n-ary, 48 

partial ordering, 63 
preference, 55, 57, 58 
reflexive, 53 
simple ordering, 64 
symmetric, 53 
ternary, 48 
transitive, 53 
union of, 207 
universal, 52 
weak ordering, 57 
Relational system, 327 
Relative frequency, 521 
Relative maximum, 339 
of function of n vari- 
ables, 485 

Relative minimum, 339 
of a function of n vari- 
ables, 485 
Reliability 
function, 496 
of a test, 607 

Rolle’s Theorem, 307 

Russell’s Paradox, 4, 6, 8,
26


Saddle point, of a function, 

487 

Sample, unordered,
161

Sample space, 510

r-sample, 159

without replacement,
159

with replacement,
160

Scalar, 184
matrix, 195

Scale, 327 

Self contradiction, 27 
Semantic information function, 82
Semiorder, 73 
Sentential connectives, 28 
Separation of variables. 

Sequences, 92-97, 104-18 
constant, 94, 95, 106 
convergent, 105 
decreasing, 94 
divergence to infinity for,


285 

divergent, 105 
double, 148 
geometric, 114 
harmonic, 94, 106 
increasing, 94 
monotone, 112 
p deleted, 276 
sub-, 95, 108 
value of, 92 
Series, 125-33, 138-51
absolutely convergent,
140

alternating, 142
conditionally convergent, 145

convergent, 126
divergent, 126

double, 148
geometric, 127
harmonic, 128
hyperharmonic, 130

McLaurin, 459 
partial sums of, 126 

tail, 133 

terms of, 126 


of, 3, 4 


7 

valence rela- 


>, 3 

i.60 


sum of, 17 





symmetric difference of, 
16 

union, 10, 12, 20 
universal, 9 

Shapley-Shubik measure of
voting power, 165 
Significance level, of a test 
procedure, 685 
Slope, of a line, 296 
Spearman-Brown formula, 
111, 276
Square root, 87 
Stability, of a system, 345 
Standard deviation, 583 
Statistical Abstract of the 
United States, 75 
Statistics, 571 
Stimulus sampling theory, 
646 

Structure, mathematical, 
363 

Successive elimination of 
variables, 214 
Sufficient condition, 38 
Sum(s) 
double, 146 

of an infinite series, 126 
of two functions, 280 
of two matrices, 182 
of two sequences, 95 
Rule for interchange of 
order in, 147 


Rules for, 122-23 
Summation notation, 119- 
25 

dummy variable in, 121
Supremum See Least 
upper bound 
Symbolic Logic, 26-47 

Tautology, 27 
Taylor Formula, 455 
for two variables, 484 
for n variables, 482 
Taylor polynomial, 455 
Taylor series, 457 
Taylor’s Theorem, 350, 455 
Ternary expansion See 
Base three 
Transformation, 79 
Tree diagram, 155 
Trials, independent, 552 
Triangle inequality, 86, 110
for vectors, 190 
Trigonometric substitu- 
tion, 429 

Truth set, of a statement, 
27 

Truth value, 26 

Unfolding technique, 64 
U N Security Council, 
167 

Utility, 58 


Utility theory, 58 
ordinal assumption of, 
58 

Valid argument, 40-47 
forms of, 42 

Variable of summation, 
119 

Vector(s), 187 
addition of, 189 
column, 187 
components of, 187 
dimension of, 187 
geometric addition of, 
189 

length of, 188 
orthogonal, 191, 263 
perpendicular, 191 
row, 187 

Venn diagram, 10 

Volume, 499 

Voting power, in U N 
Security Council, 167 

Wallis’ formula for π, 425

Well defined set, 5, 26 

Well Ordering Principle, 
97, 98 

x-axis, 77 

y-axis, 77





